ABSTRACT


Expressions for the exact power of the two-sample Mann-Whitney-
Wilcoxon U test procedure against alternatives of exponential and
rectangular populations have been derived. Several examples for total
sample sizes of 11 and 15 have been compared with Mood's median test.
Mood's test is more powerful than the U test in all instances in
which the number of observations from the null population exceeds the
number from the alternative population. The converse is true when
the number of observations from the null population is less than the
number from the alternative.

Expressions for the asymptotic efficiency of the Mann-Whitney-
Wilcoxon U test relative to Mood's and Massey's tests and the
likelihood ratio test have been derived for exponential populations. The
asymptotic efficiency of the U test relative to the likelihood
ratio test is zero.

Mood's and Massey's test procedures for two samples have been
extended to the case of discriminating among c populations on the
basis of c ordered samples. Expressions for the exact power have
been derived for Mood's test with exponential and rectangular
populations and for Massey's test with exponential populations. With
exponential translation alternatives, the tests are biased.

The exact null distributions of goodness of fit tests for one-way
and two-way contingency tables indicate that even for samples as
small as ten, the exact distribution is closely approximated by a
chi-square distribution with the appropriate degrees of freedom.

SUMMARY


Many rank tests are available to discriminate between two
populations on the basis of two ordered samples from the populations.
Of them, Mood's test procedure [16] based on the median of the
combined samples, Massey's extension of Mood's test [15] based on
fractiles, and the Mann-Whitney-Wilcoxon U test procedure [14] based
on the number of times an observation from the second sample exceeds
an observation from the first sample, have much to commend them as
quick tests.

The exact powers of Mood's and Massey's tests against
alternatives of translation in normal and exponential populations and
change in location and scale in a rectangular population have
already been investigated by Barton [2] and Chakravarti, Leone, and
Alanen [13]. Also, the exact power of the U test against the
alternative of translation in the normal population has been computed
by Dixon [6].

In Chapter I, expressions for the exact power of the two-
sample Mann-Whitney-Wilcoxon U test procedure against alternatives
of exponential and rectangular populations have been obtained.
Several examples of the power for total sample sizes of 11 and 15
have been compared with similar results obtained from Mood's median
test procedure. The results of the comparison indicate that for
these two alternatives:

i) If the number of observations from the null population
is less than the number from the alternative, the
Mann-Whitney-Wilcoxon U test is more powerful than
Mood's median test.

ii) If the number of observations from the null population
is greater than the number from the alternative, then
Mood's test is more powerful than the Mann-Whitney-
Wilcoxon test.

iii) If the number of observations from both populations
is the same, then both test procedures give
approximately the same power.

In Chapter II, expressions for the asymptotic efficiency of
the Mann-Whitney-Wilcoxon U test relative to Mood's and Massey's
tests and the likelihood ratio test have been derived for
exponential populations. The asymptotic efficiency of the Mann-Whitney-
Wilcoxon test relative to the likelihood ratio test is zero, but in
the case of Mood's and Massey's tests the resulting expressions are
non-zero.

Chapter III is devoted to extending Mood's two-sample test
procedure to the case of distinguishing among c (c > 2) populations
on the basis of c ordered samples from the populations. The
appropriate expressions for the power functions for exponential and
rectangular alternatives have been derived, and typical results for
the case of three samples from exponential populations indicate that
the test can be biased, especially when the level of significance is
small.

Similarly, in Chapter IV, Massey's two-sample test procedure
is extended to the case of distinguishing among c (c > 2)
populations on the basis of c ordered samples from the populations.
Expressions for the exact power have been derived for the exponential
translation alternatives, and again, typical results for the case of
three samples indicate that the test can be biased, especially when
the level of significance is small.

In Chapter V, the exact null distribution of goodness of fit
tests for one-way and two-way classifications is considered.
Typical results are computed and are compared with the usual
chi-square approximation. In general, the chi-square distribution with
the appropriate degrees of freedom closely approximates the exact
distribution, even for total sample sizes as small as ten. In
addition, the exact power of the test statistic arising from a
one-way classification has been computed for several alternatives,
and the results have been compared with both non-central and
central chi-square approximations. The results of the comparison
indicate that both approximations tend to overestimate the power
for small sample sizes; however, both approximations differ at
most by one percent.
CHAPTER I


EXACT POWER OF SOME TESTS BASED ON THE MANN-WHITNEY U STATISTIC


1.1 Introduction.

Many rank tests are available to discriminate between two
populations on the basis of two ordered samples from the populations.
Of them, Mood's test [16], based on the median of the combined
samples, Massey's extension of Mood's test [15], based on fractiles,
and Mann-Whitney's U test [14], based on the number of times an
observation from the second sample exceeds an observation from the
first sample, have much to commend them as quick tests.

The exact powers of Mood's and Massey's tests against
alternatives of translation in the normal and exponential distributions
and change in location and scale in the rectangular distribution
have already been computed by Barton [2] and Chakravarti, Leone
and Alanen [13]. Also the exact power of the Mann-Whitney U test
against the alternative of translation in the normal distribution
has been computed by Dixon [6].

The purposes of the investigation in this chapter are:

(i) To derive the exact power functions for the Mann-Whitney U test
of two samples against alternatives of exponential and
rectangular populations.

(ii) To tabulate and compare these results with those obtained for
Mood's median test in order to evaluate whether there is any
resultant gain in the use of the Mann-Whitney U test. The latter is
more elaborate than the former.
1.2.1 The Two Sample Problem - Mann-Whitney U Test.

Let $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ be
independently distributed with continuous cumulative distribution functions
(cdfs) F and G respectively. We want to test the hypothesis

$$H_0 : F(x) = G(x) ,$$

against the alternative given by

$$H_1 : F(x) > G(x) .$$

Let $n = n_1 + n_2$ denote the size of the combined sample and
$Z_{(1)} < Z_{(2)} < \cdots < Z_{(n)}$ be the combined ordered X's and Y's. This
ordering is unique with probability 1, since
$\Pr\{X_j = X_{j'}\} = \Pr\{Y_i = Y_{i'}\} = \Pr\{X_j = Y_i\} = 0$
due to the assumption of continuity of F and G.

The test originally proposed by Wilcoxon [22] is based on
the statistic T, which is the sum of the ranks of the Y's in the
combined ordered sample. A test of size $\alpha$ based on Wilcoxon's
statistic is:

reject $H_0$ if $T \ge t_\alpha$, and

accept $H_0$ if $T < t_\alpha$, where $\Pr\{T \ge t_\alpha \mid H_0\} \le \alpha$.

This test was modified by Mann and Whitney [14] by defining
a statistic U which is equal to the number of times a Y precedes
an X in the combined ordered sample. Then, a test of size $\alpha$
based on the Mann-Whitney U statistic is

reject $H_0$ if $U \le u_\alpha$, and

accept $H_0$ if $U > u_\alpha$, where $\Pr\{U \le u_\alpha \mid H_0\} \le \alpha$.


This U statistic is related to Wilcoxon's T statistic by

$$U = n_1 n_2 + \tfrac{1}{2}\, n_2 (n_2 + 1) - T , \tag{1.1}$$

which gives a simple way of computing U from the observed value of
T. The exact distribution of U under the null hypothesis $H_0$ has
been tabulated by Mann and Whitney [14].
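
As a small illustration of the relation between U and T, the following Python sketch computes both statistics directly from their definitions and checks the standard identity $U = n_1 n_2 + n_2(n_2+1)/2 - T$. The function name `u_and_t` and the sample values are illustrative assumptions, not part of the original text, and the sketch assumes there are no ties.

```python
def u_and_t(xs, ys):
    """Return (U, T) for samples xs (the X's) and ys (the Y's), assuming no ties."""
    # Rank the combined sample from 1 to n1 + n2.
    combined = sorted([(x, "x") for x in xs] + [(y, "y") for y in ys])
    # T: sum of the ranks of the Y's in the combined ordered sample.
    t = sum(rank for rank, (_, label) in enumerate(combined, start=1)
            if label == "y")
    # U: number of (Y, X) pairs in which the Y precedes the X.
    u = sum(1 for y in ys for x in xs if y < x)
    return u, t


xs = [1.2, 3.4, 0.5]   # n1 = 3, illustrative values only
ys = [2.0, 0.1]        # n2 = 2, illustrative values only
u, t = u_and_t(xs, ys)
n1, n2 = len(xs), len(ys)
# Check the identity relating U to T.
relation_holds = (u == n1 * n2 + n2 * (n2 + 1) // 2 - t)
```

Counting pairs directly costs $n_1 n_2$ comparisons; for larger samples the rank sum T and the identity give U more cheaply.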

1.2.2  The  Null  Distribution. 

Mann and Whitney have shown that the null distribution can
be calculated recursively from

$$p_{n_1,n_2}(u) = \frac{n_1}{n_1+n_2}\, p_{n_1-1,n_2}(u-n_2) + \frac{n_2}{n_1+n_2}\, p_{n_1,n_2-1}(u) , \tag{1.2}$$

with

$$p_{0,n_2}(u) = p_{n_1,0}(u) = 0 \quad \text{if } u > 0 ,$$

$$p_{0,n_2}(u) = p_{n_1,0}(u) = 1 \quad \text{if } u = 0 ,$$

and

$$p_{n_1,n_2}(u) = 0 \quad \text{if } u < 0 ,$$

where $p_{n_1,n_2}(u) = \Pr\{U = u \mid H_0\}$ for samples of size $n_1$ and $n_2$.
However, it would be desirable to be able to express the null
distribution in closed form and to simultaneously derive a joint density
function which could be used to calculate the exact power under
fixed alternatives.
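
The Mann-Whitney recursion can be sketched in a few lines of Python. This is a minimal illustration under the boundary conditions stated in the text (the function names `p_null` and `null_distribution` are assumptions made for this example); exact rational arithmetic is used so that the probabilities can be compared without rounding error.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def p_null(n1, n2, u):
    """Pr{U = u | H0} for n1 X's and n2 Y's, via the Mann-Whitney recursion."""
    if u < 0:
        return Fraction(0)
    if n1 == 0 or n2 == 0:
        # With one sample empty, U can only be zero.
        return Fraction(1) if u == 0 else Fraction(0)
    n = n1 + n2
    # The largest observation is an X with probability n1/n (all n2 Y's
    # precede it) or a Y with probability n2/n (it precedes no X).
    return (Fraction(n1, n) * p_null(n1 - 1, n2, u - n2)
            + Fraction(n2, n) * p_null(n1, n2 - 1, u))


def null_distribution(n1, n2):
    """List of Pr{U = u | H0} for u = 0, 1, ..., n1 * n2."""
    return [p_null(n1, n2, u) for u in range(n1 * n2 + 1)]
```

For example, `null_distribution(2, 2)` enumerates the six equally likely arrangements of two X's and two Y's.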
Let us first consider that the set $\{y_i \mid i = 1, 2, \ldots, n_2\}$
has been chosen from G, noting that there are $n_2$ factorial ways
of obtaining the set. Next we order the set and then compute the
probability of choosing a set $\{x_j \mid j = 1, 2, \ldots, n_1\}$ from F
such that a specific value for U is obtained. (That is, we want
an expression for the joint distribution of U and the Y's.) For
simplicity, we will first consider the special cases of $n_2 = 1, 2, 3$
and then generalize the results to the case of arbitrary $n_2$. For
convenience, we define the following set of symbols to simplify the
notation.

Let $\{i_k\}$ be an arbitrary set of integer variables. Define

(a) $\xi_k = u - \sum_{j=1}^{k-1} (j+1)\, i_j$ for $k > 1$, with $\xi_1 = u$,

(b) $\lambda_k = \xi_k + \sum_{j=1}^{k-1} i_j$ for $k > 1$, with $\lambda_1 = \xi_1$,

(c) $\alpha_k = n_1 - \lambda_k = n_1 - u + \sum_{j=1}^{k-1} j\, i_j$ for $k > 1$, with $\alpha_1 = n_1 - \lambda_1$,

(d) $\beta_k = -\min(0,\ \alpha_k / k)$, where a fraction such as a/b
denotes the largest integer contained in the quotient
of a divided by b,

(e) $\binom{n_1}{i_1, \ldots, i_k} = \dfrac{n_1!}{\prod_{l=1}^{k} i_l! \; \left(n_1 - \sum_{l=1}^{k} i_l\right)!}$ ,

(f) $\delta_{i,j} = \begin{cases} 1 & \text{for } i = j , \\ 0 & \text{for } i \ne j . \end{cases}$ (1.3)

For simplicity throughout, $F(y_j)$ and $G(y_j)$ will be written as
$F_j$ and $G_j$ respectively.
Let $n_2 = 1$; then we have one value of y, say $y_1$, and we
want to choose u values of x greater than $y_1$, and $n_1 - u$
values less than $y_1$. Since F is the cumulative distribution
function of x, we get:

$$h(u, y_1) = \binom{n_1}{u} F_1^{\,n_1-u} (1-F_1)^{u}\, \frac{dG_1}{dy_1} . \tag{1.4}$$

Using the special notation, this expression becomes

$$h(u, y_1) = \delta_{\beta_1,0} \binom{n_1}{\xi_1} F_1^{\,\alpha_1} (1-F_1)^{\xi_1}\, \frac{dG_1}{dy_1} , \tag{1.5}$$

where $\delta_{\beta_1,0}$ is the Kronecker delta defined by (1.3f).

If $n_2 = 2$, we want to choose $n_1$ values of x from F such
that the total number of x's greater than $y_1$ and $y_2$ is equal
to u. This can be accomplished in several ways, noting that each
value of x greater than $y_2$ is counted twice in generating the
value of u. The resulting joint distribution of u, $y_1$, and $y_2$ is

$$h(u, y_1, y_2) = 2! \sum_{i_1} \binom{n_1}{u-2i_1,\ i_1} F_1^{\,n_1-u+i_1} (F_2-F_1)^{u-2i_1} (1-F_2)^{i_1}\, \frac{dG_1}{dy_1} \frac{dG_2}{dy_2} , \tag{1.6}$$

where $i_1$ denotes the number of x's that are greater than $y_2$.
The sum over $i_1$ includes all permissible values of $i_1$ such that
none of the exponents in the expression become negative. Thus, these
restrictions on the allowable values of $i_1$ can be restated in the
following form:

(i) $u - 2i_1 \ge 0 \Rightarrow i_1 \le u/2 = \xi_1/2$, and

(ii) $n_1 - u + i_1 \ge 0 \Rightarrow i_1 \ge -\min(0,\ n_1-u) = \beta_1$, or $\beta_2 = 0$.

These results may be combined together to yield

$$\beta_1 \le i_1 \le \xi_1/2 .$$

Recalling that $\beta_2 = 0$ whenever $i_1 \ge \beta_1$, (1.6) can be written in
the following form:

$$h(u, y_1, y_2) = 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} F_1^{\,\alpha_2} (F_2-F_1)^{\xi_2} (1-F_2)^{i_1}\, \frac{dG_1}{dy_1} \frac{dG_2}{dy_2} . \tag{1.7}$$

Similarly for $n_2 = 3$, we want to choose $n_1$ values of
x from F such that the total number of x's greater than $y_1$,
$y_2$, and $y_3$ is equal to u. Again, those values of x between
$y_2$ and $y_3$ are counted twice, while those greater than $y_3$ are
counted three times. The resulting joint density function of u,
$y_1$, $y_2$, and $y_3$ under these circumstances is

$$h(u, y_1, y_2, y_3) = 3! \sum_{i_1=0}^{\xi_1/2} \sum_{i_2=0}^{\xi_2/3} \delta_{\beta_3,0} \binom{n_1}{\xi_3,\ i_1,\ i_2} F_1^{\,\alpha_3} (F_2-F_1)^{\xi_3} (F_3-F_2)^{i_1} (1-F_3)^{i_2}\, \frac{dG_1}{dy_1} \frac{dG_2}{dy_2} \frac{dG_3}{dy_3} , \tag{1.8}$$

where $i_1$ denotes the number of x's greater than $y_2$ and less
than $y_3$ and $i_2$ denotes the number of x's greater than $y_3$.

The joint density function for the general case can be found
by using techniques similar to those used in the previous cases.
This argument yields

$$h(u, y_1, \ldots, y_{n_2}) = n_2! \sum_{i_1=0}^{\xi_1/2} \cdots \sum_{i_{n_2-1}=0}^{\xi_{n_2-1}/n_2} \delta_{\beta_{n_2},0} \binom{n_1}{\xi_{n_2},\ i_1, \ldots, i_{n_2-1}} F_1^{\,\alpha_{n_2}} (F_2-F_1)^{\xi_{n_2}} (F_3-F_2)^{i_1} \cdots (F_{n_2}-F_{n_2-1})^{i_{n_2-2}} (1-F_{n_2})^{i_{n_2-1}}\, \frac{dG_1}{dy_1} \cdots \frac{dG_{n_2}}{dy_{n_2}} . \tag{1.9}$$

Now the distribution of u under the null hypothesis, F = G,
can be found by integrating the y's over the range
$-\infty < y_1 < \cdots < y_{n_2} < \infty$. To simplify the integration, we transform
the variables of integration from $y_j$ to $F(y_j) = F_j$, and the new range of
integration is $0 < F_1 < \cdots < F_{n_2} < 1$. We will first consider the
special cases of $n_2 = 1, 2, 3$ and then extend the results to the
general case.

For $n_2 = 1$, we substitute $F_1 = G_1$ in (1.4) and integrate.
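This yields

$$\varphi_0(u) = \int_0^1 \delta_{\beta_1,0} \binom{n_1}{u} F_1^{\,n_1-u} (1-F_1)^{u}\, dF_1 = \frac{\delta_{\beta_1,0}}{n_1+1} . \tag{1.10}$$

Again in the case $n_2 = 2$, we substitute $F_i = G_i$, $i = 1, 2$ into
(1.7) and integrate. Thus

$$\varphi_0(u) = 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} \int_0^1 \int_0^{F_2} F_1^{\,\alpha_2} (F_2-F_1)^{\xi_2} (1-F_2)^{i_1}\, dF_1\, dF_2 . \tag{1.11}$$

Letting $Q = F_1/F_2$ in the inner integral, (1.11) becomes

$$\varphi_0(u) = 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} \int_0^1 Q^{\alpha_2} (1-Q)^{\xi_2}\, dQ \int_0^1 F_2^{\,\alpha_2+\xi_2+1} (1-F_2)^{i_1}\, dF_2 , \tag{1.12}$$

which yields two complete Beta functions upon integration. The
resulting expression in simplified form is

$$\varphi_0(u) = \frac{2!\, n_1!}{(n_1+2)!}\, \max\left[0,\ (\xi_1/2) - \beta_1 + 1\right] . \tag{1.13}$$

Similarly the case for $n_2 = 3$ yields three complete Beta
function integrals that can be simplified to

$$\varphi_0(u) = \frac{3!\, n_1!}{(n_1+3)!} \sum_{i_1=0}^{\xi_1/2} \max\left[0,\ (\xi_2/3) - \beta_2 + 1\right] . \tag{1.14}$$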

The general case for arbitrary $n_2$ can be developed in the
same manner starting with (1.9). The resulting integrals simplify
to $n_2$ complete Beta functions. These results can be simplified
to yield

$$\varphi_0(u) = \frac{n_1!\, n_2!}{(n_1+n_2)!} \sum_{i_1=0}^{\xi_1/2} \cdots \sum_{i_{n_2-2}=0}^{\xi_{n_2-2}/(n_2-1)} \max\left[0,\ (\xi_{n_2-1}/n_2) - \beta_{n_2-1} + 1\right] . \tag{1.15}$$

As a check, $\varphi_0(u)$ was evaluated for the cases $n_1 = 8$, $1 \le n_2 \le 8$,
and $0 \le u \le n_1 n_2$, which showed complete agreement with the results
given by Mann and Whitney's recursive formulas (1.2).
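
A check of this kind can be sketched for the two-Y case. The Python fragment below evaluates the closed form for $n_2 = 2$, taken here as $\varphi_0(u) = \frac{2!\,n_1!}{(n_1+2)!}\max[0,\ \lfloor u/2 \rfloor - \beta_1 + 1]$ with $\beta_1 = \max(0, u - n_1)$, and compares it with a brute-force enumeration of all equally likely arrangements. The function names are hypothetical, and brute force stands in for the recursive tables only to keep the example self-contained.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def phi0_two_y(n1, u):
    """Closed-form null probability Pr{U = u | H0} for n2 = 2."""
    upper = u // 2                 # largest admissible i1 (xi_1 / 2)
    beta1 = max(0, u - n1)         # smallest admissible i1
    count = max(0, upper - beta1 + 1)
    return Fraction(2 * factorial(n1) * count, factorial(n1 + 2))


def phi0_brute(n1, n2, u):
    """Pr{U = u | H0} by enumerating the positions of the Y's."""
    n = n1 + n2
    hits = total = 0
    for y_positions in combinations(range(n), n2):
        occupied = set(y_positions)
        # U counts, for each Y, the X's that come after it.
        value = sum(1 for p in y_positions
                    for r in range(p + 1, n) if r not in occupied)
        total += 1
        hits += (value == u)
    return Fraction(hits, total)


agreement = all(phi0_two_y(5, u) == phi0_brute(5, 2, u) for u in range(11))
```

The enumeration grows combinatorially with the total sample size, so it is only practical for small cases such as those used here.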

1.3.1 Power of U Test Against the Alternatives of Translation in
the Exponential Population.

Here the alternative hypothesis considered is

$$F(x) = \begin{cases} 1 - e^{-x} , & x \ge 0 , \\ 0 , & x < 0 , \end{cases} \qquad G(y) = \begin{cases} 1 - e^{-(y-a)} , & y \ge a , \\ 0 , & y < a , \end{cases} \quad \text{where } a > 0 . \tag{1.15}$$

Let $\varphi_a(u)$ denote the probability of U taking on the
value u given that $H_a$ is true. Then

$$\varphi_a(u) = \int \cdots \int h(u, y_1, \ldots, y_{n_2})\, dy_1 \cdots dy_{n_2} , \tag{1.16}$$

where h is given by (1.9). We will first consider the results
for the special cases of $n_2 = 1, 2, 3$ and then will extend the
results to the general case. For convenience of notation we further
define

$$\eta = e^{-a} , \qquad \gamma = 1 - \eta . \tag{1.17}$$


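Then for $n_2 = 1$, the function to be evaluated under the
alternative hypothesis is

$$\varphi_a(u) = \int \delta_{\beta_1,0} \binom{n_1}{\xi_1} F_1^{\,\alpha_1} (1-F_1)^{\xi_1}\, dG_1 . \tag{1.18}$$

Since $G_1 = 0$ for $y_1 < a$, the range of integration on $y_1$ can be
reduced to $a \le y_1 < \infty$, and we can substitute

$$F_1 = \gamma + \eta G_1 = 1 - \eta (1 - G_1) , \quad \text{valid for } a \le y_1 < \infty ,$$

into (1.18). This yields

$$\varphi_a(u) = \eta^{\xi_1}\, \delta_{\beta_1,0} \binom{n_1}{\xi_1} \int_0^1 (\gamma + \eta G_1)^{\alpha_1} (1-G_1)^{\xi_1}\, dG_1 . \tag{1.19}$$

Now if we expand $(\gamma + \eta G_1)^{\alpha_1}$ by the binomial theorem and
interchange the order of summation and integration, we get

$$\varphi_a(u) = \binom{n_1}{\xi_1} \delta_{\beta_1,0} \sum_{v=0}^{\alpha_1} \binom{\alpha_1}{v} \gamma^{\alpha_1-v}\, \eta^{\xi_1+v} \int_0^1 G_1^{\,v} (1-G_1)^{\xi_1}\, dG_1 . \tag{1.20}$$

The resulting Beta function can be simplified to yield

$$\varphi_a(u) = n_1!\ \delta_{\beta_1,0} \sum_{v=0}^{\alpha_1} \gamma^{\alpha_1-v}\, \eta^{\lambda_1+v} \Big/ \left[(\alpha_1-v)!\, (\lambda_1+v+1)!\right] . \tag{1.21}$$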
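
The single-Y result can be evaluated numerically with a short Python sketch. It assumes the form $\varphi_a(u) = n_1! \sum_{v=0}^{n_1-u} \gamma^{n_1-u-v}\eta^{u+v} / [(n_1-u-v)!\,(u+v+1)!]$ for $0 \le u \le n_1$, with $\eta = e^{-a}$ and $\gamma = 1 - \eta$; the function name `phi_a_one_y` is an assumption made for this illustration.

```python
from math import exp, factorial


def phi_a_one_y(n1, u, a):
    """Pr{U = u} for n2 = 1 under an exponential translation of size a."""
    if not 0 <= u <= n1:
        return 0.0                     # the Kronecker delta vanishes
    eta = exp(-a)
    gamma = 1.0 - eta
    alpha1 = n1 - u                    # number of x's below y_1
    total = 0.0
    for v in range(alpha1 + 1):
        total += (gamma ** (alpha1 - v) * eta ** (u + v)
                  / (factorial(alpha1 - v) * factorial(u + v + 1)))
    return factorial(n1) * total


# With a = 0 the two populations coincide, so U is uniform on 0..n1.
null_value = phi_a_one_y(6, 3, 0.0)
# The probabilities over u = 0..n1 should sum to one for any a >= 0.
total_probability = sum(phi_a_one_y(6, u, 0.7) for u in range(7))
```

Setting a = 0 gives a quick sanity check against the null value $1/(n_1+1)$.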
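Likewise in the case $n_2 = 2$, (1.16) becomes

$$\varphi_a(u) = 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} \int \int F_1^{\,\alpha_2} (F_2-F_1)^{\xi_2} (1-F_2)^{i_1}\, dG_1\, dG_2 , \tag{1.22}$$

where the range of the y's has been reduced to $a \le y_1 \le y_2 < \infty$,
since G = 0 for y < a. Now substituting

$$F_j = \gamma + \eta G_j = 1 - \eta (1 - G_j) , \quad \text{valid for } a \le y_j < \infty , \tag{1.23}$$

in (1.22) and expanding the term $(\gamma + \eta G_1)^{\alpha_2}$, we get

$$\varphi_a(u) = 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} \sum_{v=0}^{\alpha_2} \binom{\alpha_2}{v} \gamma^{\alpha_2-v}\, \eta^{\lambda_2+v} \int_0^1 \int_0^{G_2} G_1^{\,v} (G_2-G_1)^{\xi_2} (1-G_2)^{i_1}\, dG_1\, dG_2 . \tag{1.24}$$

By transforming the variables of integration we get two complete
Beta functions which can be simplified to

$$\varphi_a(u) = n_1!\, 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \sum_{v=0}^{\alpha_2} \left[\gamma^{\alpha_2-v}\, \eta^{\lambda_2+v}\right] \left[(\alpha_2-v)!\, (\xi_2+v+i_1+2)!\right]^{-1} . \tag{1.25}$$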
In the case $n_2 = 3$, we get three complete Beta functions.
These can be simplified to yield

$$\varphi_a(u) = n_1!\, 3! \sum_{i_1=0}^{\xi_1/2} \sum_{i_2=0}^{\xi_2/3} \delta_{\beta_3,0} \sum_{v=0}^{\alpha_3} \left[\gamma^{\alpha_3-v}\, \eta^{\lambda_3+v}\right] \left[(\alpha_3-v)!\, (\lambda_3+v+3)!\right]^{-1} . \tag{1.26}$$

Following a development along the lines used in the previous
special cases, we get in the general case $n_2$ complete Beta
functions. These simplify to give

$$\varphi_a(u) = n_1!\, n_2! \sum_{i_1=0}^{\xi_1/2} \cdots \sum_{i_{n_2-2}=0}^{\xi_{n_2-2}/(n_2-1)} \sum_{i_{n_2-1}=0}^{\xi_{n_2-1}/n_2} \delta_{\beta_{n_2},0} \sum_{v=0}^{\alpha_{n_2}} \gamma^{\alpha_{n_2}-v}\, \eta^{\lambda_{n_2}+v} \Big/ \left[(\alpha_{n_2}-v)!\, (\lambda_{n_2}+v+n_2)!\right] . \tag{1.27}$$

The power of the test can be computed from (1.27) by
evaluating

$$\beta = \Pr\{U \le u_\alpha \mid H_a\} = \sum_{u=0}^{u_\alpha} \varphi_a(u) , \tag{1.28}$$

where $u_\alpha$ is determined from $\alpha$, the level of significance, by
evaluating

$$\Pr\{U \le u_\alpha \mid H_0\} \le \alpha .$$
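
The power computation can be sketched for the case $n_2 = 2$. The Python fragment below assumes the two-Y form $\varphi_a(u) = n_1!\,2! \sum_{i_1} \sum_{v} \gamma^{\alpha_2-v}\eta^{\lambda_2+v} / [(\alpha_2-v)!\,(\lambda_2+v+2)!]$ with $\alpha_2 = n_1-u+i_1$ and $\lambda_2 = u-i_1$, obtains the critical value $u_\alpha$ from the same expression at a = 0, and then sums the alternative probabilities up to $u_\alpha$. The function names and the choice $n_1 = 9$ are illustrative assumptions.

```python
from math import exp, factorial


def phi_a_two_y(n1, u, a):
    """Pr{U = u} for n2 = 2 under an exponential translation of size a."""
    eta = exp(-a)
    gamma = 1.0 - eta
    total = 0.0
    for i1 in range(u // 2 + 1):       # i1 = number of x's above y_2
        alpha2 = n1 - u + i1           # number of x's below y_1
        lam2 = u - i1                  # number of x's above y_1
        if alpha2 < 0:                 # Kronecker delta excludes this term
            continue
        for v in range(alpha2 + 1):
            total += (gamma ** (alpha2 - v) * eta ** (lam2 + v)
                      / (factorial(alpha2 - v) * factorial(lam2 + v + 2)))
    return factorial(n1) * 2 * total


def power_two_y(n1, a, level):
    """Return (u_alpha, attained size, power) for the lower-tail U test."""
    cumulative, u_alpha = 0.0, -1
    for u in range(2 * n1 + 1):
        cumulative += phi_a_two_y(n1, u, 0.0)
        if cumulative > level + 1e-12:  # small tolerance for rounding
            break
        u_alpha = u
    size = sum(phi_a_two_y(n1, u, 0.0) for u in range(u_alpha + 1))
    power = sum(phi_a_two_y(n1, u, a) for u in range(u_alpha + 1))
    return u_alpha, size, power


u_alpha, size, power = power_two_y(9, 0.5, 0.10)
```

Because U is discrete, the attained size is generally below the nominal level, which is why levels of significance rarely coincide between different tests.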
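1.4.1 $a + \theta < 1$.

As before, we will consider first the three special cases,
$n_2 = 1, 2, 3$ and then will extend the results to the general case.
Then for $n_2 = 1$, the function to be evaluated under the
alternative hypothesis is:

$$\varphi_{a\theta}(u) = \int_{y_1=-\infty}^{\infty} \delta_{\beta_1,0} \binom{n_1}{\xi_1} F_1^{\,\alpha_1} (1-F_1)^{\xi_1}\, dG_1 . \tag{1.32}$$

Since $G_1 = 0$ for $y_1 < a$ and $G_1 = 1$ for $y_1 > a + \theta$, the range
of integration on $y_1$ can be reduced to $a \le y_1 \le a + \theta$, and we
can substitute

$$F_1 = a + \theta G_1 , \quad \text{valid for } a \le y_1 \le a + \theta ,$$

in (1.32). This yields

$$\varphi_{a\theta}(u) = \binom{n_1}{\xi_1} \delta_{\beta_1,0} \int_0^1 (a + \theta G_1)^{\alpha_1} (b - \theta G_1)^{\xi_1}\, dG_1 . \tag{1.33}$$

Now if we expand $(a + \theta G_1)^{\alpha_1}$ and $(b - \theta G_1)^{\xi_1}$ and interchange
the order of summation and integration, we get

$$\varphi_{a\theta}(u) = \binom{n_1}{\xi_1} \delta_{\beta_1,0} \sum_{v=0}^{\alpha_1} \sum_{q=0}^{\xi_1} \binom{\alpha_1}{v} \binom{\xi_1}{q} (-1)^q\, a^{\alpha_1-v}\, b^{\xi_1-q}\, \theta^{v+q} \int_0^1 G_1^{\,v+q}\, dG_1 . \tag{1.34}$$

This expression can be integrated and simplified to yield

$$\varphi_{a\theta}(u) = n_1!\ \delta_{\beta_1,0} \sum_{v=0}^{\alpha_1} \sum_{q=0}^{\xi_1} \left[(-1)^q\, a^{\alpha_1-v}\, b^{\xi_1-q}\, \theta^{v+q}\right] \Big/ \left[(\alpha_1-v)!\, v!\, (\xi_1-q)!\, q!\, (v+q+1)\right] . \tag{1.35}$$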
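
The double sum for the single-Y rectangular case can be checked against direct numerical integration. The Python sketch below assumes that F is the rectangular distribution on (0, 1), that G is rectangular on $(a, a+\theta)$, and that $b = 1 - a$ (an inference from the substitution $1 - F_1 = b - \theta G_1$, since the defining passage is not part of this excerpt). The function names are hypothetical.

```python
from math import comb, factorial


def phi_rect_one_y(n1, u, a, theta):
    """Closed-form Pr{U = u} for n2 = 1, rectangular alternative, a + theta <= 1."""
    if not 0 <= u <= n1:
        return 0.0
    b = 1.0 - a                        # assumed definition of b
    alpha1, xi1 = n1 - u, u
    total = 0.0
    for v in range(alpha1 + 1):
        for q in range(xi1 + 1):
            numerator = ((-1) ** q * a ** (alpha1 - v)
                         * b ** (xi1 - q) * theta ** (v + q))
            denominator = (factorial(alpha1 - v) * factorial(v)
                           * factorial(xi1 - q) * factorial(q) * (v + q + 1))
            total += numerator / denominator
    return factorial(n1) * total


def phi_rect_numeric(n1, u, a, theta, steps=20000):
    """Midpoint-rule value of the integral over G_1 in (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        f = a + theta * (k + 0.5) * h  # F_1 = a + theta * G_1
        total += comb(n1, u) * f ** (n1 - u) * (1.0 - f) ** u
    return total * h


closed = phi_rect_one_y(5, 2, 0.2, 0.5)
numeric = phi_rect_numeric(5, 2, 0.2, 0.5)
```

The alternating signs in the sum can cause cancellation for large $n_1$, so exact rational arithmetic may be preferable beyond small sample sizes.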
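Likewise in the case $n_2 = 2$, (1.30) becomes

$$\varphi_{a\theta}(u) = 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} \int_{y_2=a}^{a+\theta} \int_{y_1=a}^{y_2} F_1^{\,\alpha_2} (F_2-F_1)^{\xi_2} (1-F_2)^{i_1}\, dG_1\, dG_2 , \tag{1.36}$$

where the range of the y's has been reduced to $a \le y_1 \le y_2 \le a + \theta$
since G = 0 for y < a and G = 1 for $y > a + \theta$. Now substituting

$$F_j = a + \theta G_j , \quad \text{valid for } a \le y_j \le a + \theta , \tag{1.37}$$

in (1.36) and expanding the terms $(a + \theta G_1)^{\alpha_2}$ and $(b - \theta G_2)^{i_1}$,
we get

$$\varphi_{a\theta}(u) = 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \sum_{v=0}^{\alpha_2} \sum_{q=0}^{i_1} \binom{n_1}{\xi_2,\ i_1} \binom{\alpha_2}{v} \binom{i_1}{q} (-1)^q\, a^{\alpha_2-v}\, b^{i_1-q}\, \theta^{\xi_2+v+q} \int_0^1 \int_0^{G_2} G_1^{\,v} (G_2-G_1)^{\xi_2}\, G_2^{\,q}\, dG_1\, dG_2 . \tag{1.38}$$

Letting $Q = G_1/G_2$, the innermost integral in (1.38) yields a
complete Beta function. The resulting expression when integrated
and simplified becomes

$$\varphi_{a\theta}(u) = n_1!\, 2! \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \sum_{v=0}^{\alpha_2} \sum_{q=0}^{i_1} \left[(-1)^q\, a^{\alpha_2-v}\, b^{i_1-q}\, \theta^{\xi_2+v+q}\right] \left[q!\, (i_1-q)!\, (\alpha_2-v)!\, (\xi_2+v+1)!\, (\xi_2+v+q+2)\right]^{-1} . \tag{1.39}$$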
1.4.2 $a + \theta = 1$.

In this special case (1.41) can be further simplified to
yield

$$\varphi_{a\theta}(u) = n_1!\, n_2! \sum_{i_1=0}^{\xi_1/2} \sum_{i_2=0}^{\xi_2/3} \cdots \sum_{i_{n_2-1}=0}^{\xi_{n_2-1}/n_2} \delta_{\beta_{n_2},0} \sum_{v=0}^{\alpha_{n_2}} \sum_{q=0}^{i_{n_2-1}} \left[(-1)^q\, a^{\alpha_{n_2}-v}\, b^{\lambda_{n_2}+v}\right] \left[q!\, (\alpha_{n_2}-v)!\, (i_{n_2-1}-q)!\, (\lambda_{n_2}-i_{n_2-1}+v+n_2-1)!\, (\lambda_{n_2}-i_{n_2-1}+v+q+n_2)\right]^{-1} . \tag{1.42}$$

1.4.3 $a + \theta > 1$.

This case must be further subdivided into two subcases,
namely: (i) $a < 1$,

(ii) $a \ge 1$.

1.4.3.1 $a < 1$.

For a < 1, the range of integration for y can be split
into four parts, namely: (1) $-\infty < y < a$, (2) $a \le y \le 1$,
(3) $1 < y \le a + \theta$, (4) $a + \theta < y < \infty$. Over parts (1) and (4)
the value of the integral is zero since G is constant. Hence, we
will consider only the ranges (2) and (3).
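For the case of $n_2 = 1$, (1.30) becomes

$$\varphi_{a\theta}(u) = 1! \left[ \int_{y_1=a}^{1} \delta_{\beta_1,0} \binom{n_1}{\xi_1} F_1^{\,\alpha_1} (1-F_1)^{\xi_1}\, dG_1 + \delta_{u,0} \int_{y_1=1}^{a+\theta} dG_1 \right] , \tag{1.43}$$

since $\xi_1 = u$ and u = 0 implies that $\beta_1 = 0$.
Consider $P_1(u)$ defined by

$$P_1(u) = (n_1!)^{-1} \int_{y_1=a}^{1} \delta_{\beta_1,0} \binom{n_1}{\xi_1} F_1^{\,\alpha_1} (1-F_1)^{\xi_1}\, dG_1 . \tag{1.44}$$

Perform the following substitution: $F_1 = a + \theta G_1$, $a \le y_1 \le 1$.
Then (1.44) becomes

$$P_1(u) = (n_1!)^{-1} \int_0^{b/\theta} \delta_{\beta_1,0} \binom{n_1}{\xi_1} (a + \theta G_1)^{\alpha_1} (b - \theta G_1)^{\xi_1}\, dG_1 . \tag{1.45}$$

If we expand both binomials and integrate the resulting expression,
(1.45) takes the form:

$$P_1(u) = \sum_{v=0}^{\alpha_1} \sum_{q=0}^{\xi_1} \left[\delta_{\beta_1,0}\, (-1)^q\, a^{\alpha_1-v}\, b^{\xi_1+v+1}\right] \left[\theta\, v!\, q!\, (\alpha_1-v)!\, (\xi_1-q)!\, (v+q+1)\right]^{-1} . \tag{1.46}$$

If we define

$$P_0(u) = \delta_{u,0} / n_1! , \tag{1.47}$$

then since $G_1(1) = b/\theta$, (1.43) can be written as

$$\varphi_{a\theta}(u) = n_1!\, 1!\, \left[P_1(u) + P_0(u)(1 - b/\theta)\right] . \tag{1.48}$$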

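If $n_2 = 2$, the expression for $\varphi_{a\theta}(u)$ can be broken down
into three integrals as follows:

$$\varphi_{a\theta}(u) = 2! \int_{y_2=a}^{1} \int_{y_1=a}^{y_2} \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} F_1^{\,\alpha_2} (F_2-F_1)^{\xi_2} (1-F_2)^{i_1}\, dG_1\, dG_2$$
$$\quad + \; 2! \int_{y_2=1}^{a+\theta} \int_{y_1=a}^{1} \delta_{\beta_1,0} \binom{n_1}{\xi_1} F_1^{\,\alpha_1} (1-F_1)^{\xi_1}\, dG_1\, dG_2$$
$$\quad + \; 2!\ \delta_{u,0} \int_{y_2=1}^{a+\theta} \int_{y_1=1}^{y_2} dG_1\, dG_2 . \tag{1.49}$$

Note that when $i_{k-1} = 0$, then $\xi_k = \xi_{k-1}$, $\alpha_k = \alpha_{k-1}$, $\lambda_k = \lambda_{k-1}$,
$\beta_k = \beta_{k-1}$ $(k = 2, 3, \ldots, n_2)$. Also u = 0 implies that $\beta_1 = 0$.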
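Let us consider the first of the above three integrals by defining

$$P_2(u) = (n_1!)^{-1} \int_{y_2=a}^{1} \int_{y_1=a}^{y_2} \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} F_1^{\,\alpha_2} (F_2-F_1)^{\xi_2} (1-F_2)^{i_1}\, dG_1\, dG_2 . \tag{1.50}$$

We can substitute

$$F_j = a + \theta G_j ; \quad a \le y_j \le 1 , \tag{1.51}$$

into (1.50). Also we can expand the binomials and interchange the
order of summation and integration. This yields

$$P_2(u) = (n_1!)^{-1} \sum_{i_1=0}^{\xi_1/2} \sum_{v=0}^{\alpha_2} \sum_{q=0}^{i_1} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} \binom{\alpha_2}{v} \binom{i_1}{q} (-1)^q\, a^{\alpha_2-v}\, b^{i_1-q}\, \theta^{\xi_2+v+q} \int_{y_2=a}^{1} \int_{y_1=a}^{y_2} G_1^{\,v} (G_2-G_1)^{\xi_2}\, G_2^{\,q}\, dG_1\, dG_2 . \tag{1.52}$$

This expression integrates and simplifies to

$$P_2(u) = \sum_{i_1=0}^{\xi_1/2} \sum_{v=0}^{\alpha_2} \sum_{q=0}^{i_1} \left[\delta_{\beta_2,0}\, (-1)^q\, a^{\alpha_2-v}\, b^{\lambda_2+v+2}\right] \left[\theta^2\, q!\, (i_1-q)!\, (\alpha_2-v)!\, (\xi_2+v+1)!\, (\xi_2+v+q+2)\right]^{-1} . \tag{1.53}$$

With this notation (1.49) can be written as

$$\varphi_{a\theta}(u) = n_1!\, 2!\, \left[P_2(u) + P_1(u)(1 - b/\theta) + P_0(u)(1 - b/\theta)^2 (2!)^{-1}\right] . \tag{1.54}$$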

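For $n_2 = 3$, the expression for $\varphi_{a\theta}(u)$ becomes

$$\varphi_{a\theta}(u) = 3! \int_{y_3=a}^{1} \int_{y_2=a}^{y_3} \int_{y_1=a}^{y_2} \sum_{i_1=0}^{\xi_1/2} \sum_{i_2=0}^{\xi_2/3} \delta_{\beta_3,0} \binom{n_1}{\xi_3,\ i_1,\ i_2} F_1^{\,\alpha_3} (F_2-F_1)^{\xi_3} (F_3-F_2)^{i_1} (1-F_3)^{i_2}\, dG_1\, dG_2\, dG_3$$
$$\quad + \; 3! \int_{y_3=1}^{a+\theta} \int_{y_2=a}^{1} \int_{y_1=a}^{y_2} \sum_{i_1=0}^{\xi_1/2} \delta_{\beta_2,0} \binom{n_1}{\xi_2,\ i_1} F_1^{\,\alpha_2} (F_2-F_1)^{\xi_2} (1-F_2)^{i_1}\, dG_1\, dG_2\, dG_3$$
$$\quad + \; 3! \int_{y_3=1}^{a+\theta} \int_{y_2=1}^{y_3} \int_{y_1=a}^{1} \delta_{\beta_1,0} \binom{n_1}{\xi_1} F_1^{\,\alpha_1} (1-F_1)^{\xi_1}\, dG_1\, dG_2\, dG_3$$
$$\quad + \; 3! \int_{y_3=1}^{a+\theta} \int_{y_2=1}^{y_3} \int_{y_1=1}^{y_2} \delta_{u,0}\, dG_1\, dG_2\, dG_3 . \tag{1.55}$$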
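Now, if we define

$$P_3(u) = (n_1!)^{-1} \int_{y_3=a}^{1} \int_{y_2=a}^{y_3} \int_{y_1=a}^{y_2} \sum_{i_1=0}^{\xi_1/2} \sum_{i_2=0}^{\xi_2/3} \delta_{\beta_3,0} \binom{n_1}{\xi_3,\ i_1,\ i_2} F_1^{\,\alpha_3} (F_2-F_1)^{\xi_3} (F_3-F_2)^{i_1} (1-F_3)^{i_2}\, dG_1\, dG_2\, dG_3 ,$$

which upon integration and simplification becomes

$$P_3(u) = \sum_{i_1=0}^{\xi_1/2} \sum_{i_2=0}^{\xi_2/3} \sum_{v=0}^{\alpha_3} \sum_{q=0}^{i_2} \left[\delta_{\beta_3,0}\, (-1)^q\, a^{\alpha_3-v}\, b^{\lambda_3+v+3}\right] \left[\theta^3\, q!\, (\alpha_3-v)!\, (i_2-q)!\, (\lambda_3-i_2+v+2)!\, (\lambda_3-i_2+v+q+3)\right]^{-1} , \tag{1.56}$$

then (1.55) becomes

$$\varphi_{a\theta}(u) = n_1!\, 3! \sum_{j=0}^{3} \left\{ P_j(u)\, (1 - b/\theta)^{3-j} \left[(3-j)!\right]^{-1} \right\} . \tag{1.57}$$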


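In an analogous fashion for the general case, we can define
$P_j(u)$ and obtain

$$P_j(u) = \sum_{i_1=0}^{\xi_1/2} \cdots \sum_{i_{j-1}=0}^{\xi_{j-1}/j} \sum_{v=0}^{\alpha_j} \sum_{q=0}^{i_{j-1}} \left[\delta_{\beta_j,0}\, (-1)^q\, a^{\alpha_j-v}\, b^{\lambda_j+v+j}\right] \left[\theta^j\, q!\, (\alpha_j-v)!\, (i_{j-1}-q)!\, (\lambda_j-i_{j-1}+v+j-1)!\, (\lambda_j-i_{j-1}+v+q+j)\right]^{-1} \tag{1.58}$$

for $j \ge 2$, with $P_0(u)$, $P_1(u)$ as previously defined.
Then

$$\varphi_{a\theta}(u) = n_1!\, n_2! \sum_{j=0}^{n_2} \left\{ P_j(u)\, (1 - b/\theta)^{n_2-j} \left[(n_2-j)!\right]^{-1} \right\} . \tag{1.59}$$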

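1.4.3.2 $a \ge 1$.

In the case $a \ge 1$, the results are trivial, namely:

$$\varphi_{a\theta}(u) = n_2!\ \delta_{u,0} \int_{y_{n_2}=a}^{a+\theta} \int_{y_{n_2-1}=a}^{y_{n_2}} \cdots \int_{y_1=a}^{y_2} dG_1\, dG_2 \cdots dG_{n_2} = \delta_{u,0} . \tag{1.60}$$

Using the above results for $\varphi_{a\theta}(u)$, the power of the U test
under the alternative hypothesis can be calculated from

$$\beta = \Pr\{U \le u_\alpha \mid H_{a\theta}\} = \sum_{u=0}^{u_\alpha} \varphi_{a\theta}(u) , \tag{1.61}$$

where $u_\alpha$ is determined by the level of significance, $\alpha$, from
the relation

$$\Pr\{U \le u_\alpha \mid H_0\} \le \alpha .$$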

1.5 Results.

Tables 1.1 and 1.2 compare the powers of the Mann-Whitney U
test against Mood's median test. The numerical results for Mood's
test presented in Tables 1.1 and 1.2 were taken from Leone,
Chakravarti and Alanen [13]. In Table 1.1, the exponential
alternative is considered for various values of the location parameter, a,
for a = 0(0.1)1, 1.5, 2, 3, for sample sizes 11 and 15. It should
be noted that when the location parameter is zero, we get the null
distribution with the power equal to the level of significance, $\alpha$.
Since the distributions of the test statistics are discrete, the
values of $\alpha$ do not in general coincide for both the tests. Hence,
although many different cases have been computed, only those values
that are relatively close together and which indicate the general
trend have been tabulated in Table 1.1. The conclusions that can
be drawn from this table (relative to the exponential alternative) are:

1) If $n_1$ is smaller than $n_2$, the Mann-Whitney test is more
powerful than Mood's test. To note this increase of power,
several cases were intentionally chosen where the level of
significance for the Mann-Whitney test was slightly less
than that of the Mood test. In these cases, the power of
Mann-Whitney's test rapidly overtakes Mood's test as the
location parameter, a, increases.

2) If $n_1$ is larger than $n_2$, Mood's test is more powerful
than the Mann-Whitney test. Likewise, to note this increase
of power, several cases were intentionally chosen where the
level of significance for Mood's test was slightly less than
that of the Mann-Whitney test. In these cases, the power of
Mood's test rapidly overtakes Mann-Whitney's test as a
increases.

3) In those cases in which $n_1 = n_2$, the two test procedures
seem to exhibit powers that are approximately the same.
The rectangular alternative for the special case in which
$\theta = 1 - a$ is considered in Table 1.2. The values of the parameter,
a, range between 0.0 and 0.9 with increments of 0.1 (where the
value of 0.0 indicates the level of significance, $\alpha$, of the test
under the null hypothesis). The total sample sizes chosen are again
11 and 15. As in the case of the exponential alternatives, the levels
of significance do not in general coincide, since the distributions
are discrete, but in those cases in which the levels are relatively
close together, the results indicate that the conclusions drawn from
the exponential data continue to hold in the rectangular case.

In both of these tables, a is non-negative. If the
alternative hypothesis were for a < 0, the same situation would hold.
That is, if $n_1 > n_2$ the Mood test would exhibit more power, while
the Mann-Whitney test would be more powerful for $n_1 < n_2$.

These results (that is, with respect to the exponential and
rectangular alternatives) indicate that in those cases when $n_1 > n_2$,
it is preferable to use Mood's median test over the Mann-Whitney U
test. A further advantage in the case of Mood's test is that the
experiment needs to be run only until the median of the combined
sample has been observed. In many experiments, this fact gives rise
to a reduction in the cost, due to savings in time, experimental
material, availability of equipment, and the like.
1.6 Further Extensions.

First, the results in Tables 1.1 and 1.2 should be extended to
larger sample sizes to see if the previous results still hold. This
extension will also indicate how rapidly the results approach the
asymptotic situation, and the complete tables can be used to
determine the sample size required to obtain a given power.

Second, tables similar to Tables 1.1 and 1.2 should be
computed for comparing the Mann-Whitney U test with Massey's two
sample test.

Third, exponential alternatives with a change in the scale
parameter should be considered. Power functions for these
alternatives can be developed for Mood's, Massey's and Mann-Whitney's two
sample tests, and tables comparing these results can be computed.

Fourth, an attempt should be made to analytically compare
the power functions of Mood's, Massey's and Mann-Whitney's tests
independent of the computational results to see if the same
conclusions are indicated.

Fifth, an attempt should be made to develop a class of
functions to which the results of this chapter can be applied.

Sixth, the power function for the Mann-Whitney U test
can be extended to the case of c samples for whatever tests are
developed for the c sample case.
CHAPTER II


ASYMPTOTIC RELATIVE EFFICIENCY OF THE MANN-WHITNEY U TEST AGAINST AN

EXPONENTIAL ALTERNATIVE


2.1  Introduction. 

In Chapter 1, the exact power of the Mann-Whitney two sample
U  test  for  discriminating  between  two  populations  was  derived.  Two 
types  of  alternatives  were  considered;  namely,  a  change  in  location 
of  an  exponential  population  and  a  change  in  location  and  scale  of  a 
rectangular  population. 

The asymptotic relative efficiency of the Mann-Whitney U test
against an alternative of a change in location of a normal population
was shown to be $3/\pi$ [16], [1]. The asymptotic relative efficiencies
of Mood's test based on the median, and Massey's test based on the
first quartile and the median, when compared against the likelihood
ratio test appropriate for detecting a shift in location of an
exponential population, were found to be zero by Chakravarti, Leone, and
Alanen [3].

In this chapter the asymptotic relative efficiencies of the
Mann-Whitney U test, when compared with the likelihood ratio test and
with Mood's and Massey's tests for detecting a shift in location of an
exponential population, are considered.




2.2  Limiting  Distribution  of  the  Mann-Whitney  U  Statistic. 


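Let us define the statistic $V_{n_1,n_2}$ by

$$V_{n_1,n_2} = \frac{U}{n_1 n_2} , \qquad (2.1)$$

so that

$$E(V_{n_1,n_2}) = \frac{E(U)}{n_1 n_2} , \qquad (2.2)$$

$$\mathrm{Var}(V_{n_1,n_2}) = \frac{\mathrm{Var}(U)}{(n_1 n_2)^2} . \qquad (2.3)$$

It has been shown by Lehmann, [11] and [12], that

$$\frac{V_{n_1,n_2} - E(V_{n_1,n_2})}{[\mathrm{Var}(V_{n_1,n_2})]^{1/2}} \qquad (2.4)$$

has an asymptotic normal distribution, provided that as $n_1, n_2 \to \infty$,

$$(n_1/n_2) \to \text{constant} < \infty . \qquad (2.5)$$

Furthermore, Mann and Whitney [14] have shown that

$$E(V_{n_1,n_2}) = \int G \, dF , \qquad (2.6)$$

$$n_1 n_2 \mathrm{Var}(V_{n_1,n_2}) = [(n_1+n_2+1)/12] + [(n_1-1)(\lambda-\epsilon_1)] + [(n_2-1)(\lambda-\epsilon_2)] - [(n_1+n_2-1)\lambda^2] , \qquad (2.7)$$

where

$$\lambda = \tfrac{1}{2} - \int G \, dF , \quad \epsilon_1 = \tfrac{1}{3} - \int G^2 \, dF , \quad \text{and} \quad \epsilon_2 = \tfrac{1}{3} - \int (1-F)^2 \, dG .$$

Thus, the expressions for the mean and variance of U can be written

$$E(U) = n_1 n_2 \int G \, dF , \qquad (2.8)$$

$$\mathrm{Var}(U) = (n_1 n_2)\{[(n_1+n_2+1)/12] + (n_1-1)(\lambda-\epsilon_1) + (n_2-1)(\lambda-\epsilon_2) - (n_1+n_2-1)\lambda^2\} . \qquad (2.9)$$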

With the exponential alternative considered in Chapter 1 (see
Equation (1.15)), (2.8) and (2.9) become

$$E(U) = (n_1 n_2/2)\, e^{-a} \qquad (2.10)$$

and

$$\mathrm{Var}(U) = (n_1 n_2/12)[(n_1+n_2+1) + 2(n_1-1)(1-e^{-a}) + 2(n_2-1)(1-e^{-a})(1-2e^{-a}) - 3(n_1+n_2-1)(1-e^{-a})^2] . \qquad (2.11)$$
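
A quick numerical check of (2.10) and (2.11) is sketched below. The function name is hypothetical and the formulas are transcribed as read from the two equations; the sketch only confirms that they reduce to the familiar null values at a = 0 and that the variance vanishes when the two populations separate completely.

```python
from math import exp


def mw_moments_exponential(n1, n2, a):
    """Mean and variance of U from (2.10) and (2.11) for a shift a >= 0.

    Hypothetical helper; n1 and n2 are the two sample sizes as they
    appear in the equations.
    """
    g = 1.0 - exp(-a)  # the recurring factor 1 - e^{-a}
    mean = 0.5 * n1 * n2 * exp(-a)
    bracket = ((n1 + n2 + 1)
               + 2.0 * (n1 - 1) * g
               + 2.0 * (n2 - 1) * g * (1.0 - 2.0 * exp(-a))
               - 3.0 * (n1 + n2 - 1) * g * g)
    return mean, n1 * n2 * bracket / 12.0


if __name__ == "__main__":
    for shift in (0.0, 0.5, 1.0, 2.0):
        print(shift, mw_moments_exponential(5, 6, shift))
```

At a = 0 the pair returned is $n_1 n_2/2$ and $n_1 n_2 (n_1+n_2+1)/12$, the usual null mean and variance of U.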


2.3 Asymptotic Relative Efficiency of the Mann-Whitney U Test.

Let $\theta$, the parameter of interest, label the sequence of
distributions. Consider the null hypothesis $H_0 : \theta = \theta_0$ and the
sequence of alternatives $H_m : \theta_m = \theta_0 + d m^{-r}$ for some positive r
and d. Let $N(\delta)$ and $N^*(\delta)$ be respectively the sample sizes
required by two test procedures $\tau$ and $\tau^*$, to achieve the same power
$(1-\beta)$ at the same level of significance $\alpha$, where $\delta$ is the
difference $\theta_m - \theta_0$. Then, the asymptotic efficiency of $\tau^*$ relative to
$\tau$ is defined as:

$$\mathrm{Eff}(\tau^*, \tau) = \lim_{\delta \to 0} [N(\delta)/N^*(\delta)] . \qquad (2.12)$$

Since for the Mann-Whitney U test,

$$\lim_{n_1,n_2 \to \infty} P_{\theta_n}\{(n_1 n_2)^{1/2}[U/n_1 n_2 - E(U/n_1 n_2)][n_1 n_2 \mathrm{Var}(U/n_1 n_2)]^{-1/2} \le x\} = \Phi(x) = (2\pi)^{-1/2}\int_{-\infty}^{x} e^{-t^2/2}\, dt , \qquad (2.13)$$

where $\theta_n = d n^{-1/2}$ and $\theta_0 = 0$, the following theorem due to
Hoeffding and Rosenblatt [9] can be applied:

Theorem: If for a sequence of test procedures $\{t_n\}$, where $t_n$ is
based on a random sample of size n, the following regularity
conditions hold:

a) $\beta_n(\theta_0) \le \alpha$, $\lim_{n \to \infty} \beta_n(\theta_0) = \alpha$, where $\beta_n(\theta)$ is the
probability of rejecting the null hypothesis,

b) There exists a positive r and normalizing functions
$\mu(\theta)$ and $\sigma(\theta)$ such that for any real x and any d > 0,

$$\lim_{n \to \infty} P_{\theta_n}\{n^r[(t_n - \mu(\theta_n))/\sigma(\theta_n)] \le x\} = \Phi(x) = (2\pi)^{-1/2}\int_{-\infty}^{x} e^{-t^2/2}\, dt ,$$

where $\theta_n = \theta_0 + d n^{-r}$,

c) $\mu(\theta)$ has a derivative $\mu'(\theta_0)$ at $\theta_0$ and $\mu'(\theta_0) > 0$,

d) $\sigma(\theta)$ is continuous and positive at $\theta = \theta_0$,

then

$$\lim_{n \to \infty} \beta_n(\theta_0 + d n^{-r}) = \Phi[d\,\mu'(\theta_0)/\sigma(\theta_0) - \lambda_\alpha] , \qquad (2.14)$$

where $\Phi(-\lambda_\alpha) = \alpha$. The efficiency index $N(\delta)$ of the test
based on $t_n$ has the expression

$$N(\delta) = [(\lambda_\alpha + \lambda_\beta)\,\sigma(\theta_0)/(\delta\,\mu'(\theta_0))]^{1/r} . \qquad (2.15)$$
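
The step from (2.14) to (2.15) can be spelled out as follows. This is a sketch only; it assumes that $\lambda_\beta$ is defined by $\Phi(-\lambda_\beta) = \beta$, by analogy with the definition of $\lambda_\alpha$ given with (2.14), and that $\delta = d n^{-r}$.

```latex
\begin{align*}
\Phi\!\left[\frac{d\,\mu'(\theta_0)}{\sigma(\theta_0)} - \lambda_\alpha\right] = 1 - \beta
  &\;\Longrightarrow\;
  \frac{d\,\mu'(\theta_0)}{\sigma(\theta_0)} - \lambda_\alpha = \lambda_\beta
  \;\Longrightarrow\;
  d = \frac{(\lambda_\alpha + \lambda_\beta)\,\sigma(\theta_0)}{\mu'(\theta_0)} , \\
\delta = d\,n^{-r}
  &\;\Longrightarrow\;
  n = \left(\frac{d}{\delta}\right)^{1/r}
    = \left[\frac{(\lambda_\alpha + \lambda_\beta)\,\sigma(\theta_0)}
                 {\delta\,\mu'(\theta_0)}\right]^{1/r} .
\end{align*}
```

The sample size n obtained in the last line is the quantity denoted $N(\delta)$.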

The Mann-Whitney test procedure with $\theta_0 = 0$ satisfies all
of the hypotheses of Hoeffding and Rosenblatt's theorem with the
exception of part (c), $\mu'(\theta_0) > 0$. In this case

$$\mu'(\theta_0) = -(n_1 n_2/2)\, e^{-\theta_0} ,$$

which for $\theta_0 = 0$ becomes

$$\mu'(0) = -(n_1 n_2/2) < 0 .$$

However, the restriction that $\mu'(\theta_0) > 0$ is not necessary in the
case of $r = \tfrac{1}{2}$, since Pitman's original result for $r = \tfrac{1}{2}$ does not
require this restriction on $\mu'(\theta_0)$, [18]. Since for the Mann-Whitney
test procedure $r = \tfrac{1}{2}$, we can ignore the restriction on
$\mu'(\theta_0)$ and apply the above theorem with $\theta_0 = 0$. This yields

$$N_1(\delta) = \{(\lambda_\alpha + \lambda_\beta)[n_1 n_2 \mathrm{Var}(U/n_1 n_2)_0]^{1/2}[\delta\, E'(U/n_1 n_2)_0]^{-1}\}^2 , \qquad (2.16)$$

where $E'(U/n_1 n_2)$ denotes differentiation with respect to $\theta$ and
the subscript 0 means evaluate at $\theta = 0$.

Now, for the exponential alternatives with a shift in the
location parameter, (2.16) becomes

$$N_1(\delta) = [2(\lambda_\alpha + \lambda_\beta)((n_1+n_2+1)/12)^{1/2}\,\delta^{-1}]^2 = (\lambda_\alpha + \lambda_\beta)^2 (n_1+n_2+1)(3\delta^2)^{-1} . \qquad (2.17)$$

Similar results have been derived for Mood's and Massey's
tests, and the likelihood ratio test by Chakravarti, Leone, and
Alanen [3]. Their results are summarized below:

For Mood's test procedure based on the median

$$N_2(\delta) = (\lambda_\alpha + \lambda_\beta)^2 (n_1+n_2)(n_2 \delta^2)^{-1} . \qquad (2.18)$$

For Mood's test procedure based on the first quartile

$$N_3(\delta) = (\lambda_\alpha + \lambda_\beta)^2 (n_1+n_2)(3 n_2 \delta^2)^{-1} . \qquad (2.19)$$

For Massey's test procedure based on the first quartile and median

$$N_4(\delta) = (n_1+n_2)\,\Delta^2 (3 n_2 \delta^2)^{-1} , \qquad (2.20)$$

where $\Delta^2$ is a solution of the equation

$$\int_{\chi^2_\alpha}^{\infty} f(\chi^2, \Delta^2)\, d\chi^2 = 1 - \beta , \qquad (2.21)$$

and f is the non-central chi-square density function with two
degrees of freedom.

For the Likelihood Ratio test procedure

$$N^*(\delta) = D^*/\delta , \qquad (2.22)$$

where $D^*$ denotes the solution of $H(D^*) = 1 - \beta$,

and $H(d) = \lim_{n \to \infty} \beta_n(\theta_0 + d n^{-r})$.

It is easily seen that the asymptotic efficiency of all of the above
tests relative to the likelihood ratio test is zero, since

$$N^*(\delta)/N_i(\delta) \to 0 \text{ as } \delta \to 0 \text{ for } i = 1, 2, 3, 4 . \qquad (2.23)$$

Likewise, the asymptotic efficiency of the Mann-Whitney test relative
to the median test is

$$\mathrm{Eff}(\tau_1, \tau_2) = N_2(\delta)/N_1(\delta) = 3(n_1+n_2)[n_2(n_1+n_2+1)]^{-1} \approx 3/n_2 . \qquad (2.24)$$

The asymptotic efficiency of the Mann-Whitney test relative to the
test based on the first quartile is

$$\mathrm{Eff}(\tau_1, \tau_3) = N_3(\delta)/N_1(\delta) = (n_1+n_2)[n_2(n_1+n_2+1)]^{-1} \approx n_2^{-1} . \qquad (2.25)$$

Also, the asymptotic efficiency of the Mann-Whitney test relative to
Massey's test based on the first quartile and the median is

$$\mathrm{Eff}(\tau_1, \tau_4) = N_4(\delta)/N_1(\delta) = (n_1+n_2)\,\Delta^2 [n_2(n_1+n_2+1)(\lambda_\alpha + \lambda_\beta)^2]^{-1} \approx \Delta^2 [n_2(\lambda_\alpha + \lambda_\beta)^2]^{-1} . \qquad (2.26)$$
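
The indices (2.17) through (2.19) and the ratios (2.24) and (2.25) are simple enough to tabulate directly. The sketch below is illustrative only: the function names and the default values of delta, alpha and beta are assumptions, the formulas are as read from the equations above, and Massey's index (2.20) is omitted because it needs the non-central chi-square solution of (2.21).

```python
from statistics import NormalDist


def upper_point(p):
    """Return lambda_p with Phi(-lambda_p) = p for the standard normal."""
    return -NormalDist().inv_cdf(p)


def sample_size_indices(n1, n2, delta, alpha, beta):
    """Efficiency indices N_1, N_2, N_3 as read from (2.17)-(2.19)."""
    s2 = (upper_point(alpha) + upper_point(beta)) ** 2
    return {
        "mann_whitney": s2 * (n1 + n2 + 1) / (3.0 * delta ** 2),
        "mood_median": s2 * (n1 + n2) / (n2 * delta ** 2),
        "mood_quartile": s2 * (n1 + n2) / (3.0 * n2 * delta ** 2),
    }


def efficiencies(n1, n2, delta=0.1, alpha=0.05, beta=0.10):
    """Ratios N_2/N_1 and N_3/N_1, i.e. (2.24) and (2.25)."""
    idx = sample_size_indices(n1, n2, delta, alpha, beta)
    return (idx["mood_median"] / idx["mann_whitney"],
            idx["mood_quartile"] / idx["mann_whitney"])


if __name__ == "__main__":
    print(efficiencies(5, 6))
```

Both ratios are free of delta, alpha and beta, since the factor $(\lambda_\alpha + \lambda_\beta)^2/\delta^2$ cancels.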
2.4 Further Extensions.

First,  the  results  given  in  the  above  equations  can  be 
tabulated  for  various  sample  sizes,  and  a  comparison  can  be  made 
to  determine  the  asymptotically  most  efficient  test  procedure 
among  those  considered. 

Second,  the  above  tests  can  be  compared  with  the  standard 
t  test  for  detecting  a  shift  in  the  location  parameter,  when 
the  usual  assumption  of  normality  is  made. 


CHAPTER  III 


EXACT  POWER  OF  SOME  TESTS  BASED  ON  A  GENERALIZATION  OF  MOOD'S 

STATISTIC 


3.1 Introduction.

In many practical situations, such as life testing, the
sample observations arise in order of their magnitude, so that the
first observation is always the smallest, the second observation
is second smallest, and so on. To discriminate between
two populations on the basis of two such ordered samples, many rank
tests are available. Of them, Mood's test [16] based on the median
of the combined samples and Massey's extension of Mood's test [15]
based on fractiles, have much to commend themselves as quick tests.
The exact power of these tests against the alternatives of exponential
and rectangular populations for the case of two populations has
been investigated in detail by Chakravarti, Leone, and Alanen [13].
The purpose of this investigation is to extend the results available
for Mood's two sample test to the case of discriminating among c
populations on the basis of c ordered samples. The corresponding
extension of Massey's test is investigated in a subsequent chapter.

3.2 The c Sample Problem.

Let $\{X_1^{(i)}, X_2^{(i)}, \ldots, X_{n_i}^{(i)}\}$ for $i = 1, 2, \ldots, c$ be
c sets of independently distributed random variables with continuous
cumulative distribution functions $F_i$, respectively. We wish to
test the hypothesis

$$H_0 : F_1(x) = F_2(x) = \ldots = F_c(x) ,$$

against the alternative

$$H_a : F_1(x) > F_2(x) > \ldots > F_c(x) .$$

Denote the size of the combined sample by $n = \sum_{i=1}^{c} n_i$ and
for the sake of simplicity assume that $n = 2r + 1$, where r is an
integer. Let $Z_1 < Z_2 < \ldots < Z_n$ be the ordered combined
sample. $Z = Z_{r+1}$ denotes the median of the combined sample, and
$U_i$ denotes the number of observations in the $i$th sample less
than Z $(i = 1, 2, \ldots, c)$.

Thus, the observations can be arranged to form a 2 by c
contingency table as follows:


              Number in Sample below and above Median

Category                      1st Sample   2nd Sample   ...   cth Sample   Total

Less than Z                   u_1          u_2          ...   u_c          r

Greater than or equal to Z    n_1 - u_1    n_2 - u_2    ...   n_c - u_c    n - r

Total                         n_1          n_2          ...   n_c          n


Subject to restrictions

$$\sum_{i=1}^{c} n_i = n , \qquad \sum_{i=1}^{c} u_i = r . \qquad (3.1)$$
We can define a statistic T by

$$T = \sum_{i=1}^{c} \{(u_i - r n_i/n)^2 (r n_i/n)^{-1} + [(n_i - u_i) - (n-r) n_i/n]^2 [(n-r) n_i/n]^{-1}\} . \qquad (3.2)$$

Then a test of size $\alpha$ based on the statistic T is as follows:

reject $H_0$ if $T \ge t_\alpha$ ,

accept $H_0$ if $T < t_\alpha$ ,

where $t_\alpha$ is defined by $\Pr\{T \ge t_\alpha \mid H_0\} \le \alpha$.

This problem is equivalent to testing a 2 by c contingency
table with fixed marginal sums for independence between columns.
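
As an illustration, the statistic can be computed directly from the first row of the table. The sketch below assumes that T is the ordinary chi-square statistic of the 2 by c table, with row totals r and n - r; the function name is hypothetical.

```python
def median_test_statistic(u, sizes):
    """Chi-square statistic of the 2 by c median table.

    u[i]     : number of observations of sample i below the combined median
    sizes[i] : size n_i of sample i; the total n is assumed to be 2r + 1
    """
    n = sum(sizes)
    if n % 2 == 0:
        raise ValueError("the combined sample size is assumed to be odd")
    r = (n - 1) // 2
    if sum(u) != r:
        raise ValueError("the u_i must add up to r = (n - 1) / 2")
    total = 0.0
    for u_i, n_i in zip(u, sizes):
        expected_below = r * n_i / n
        expected_above = (n - r) * n_i / n
        total += (u_i - expected_below) ** 2 / expected_below
        total += ((n_i - u_i) - expected_above) ** 2 / expected_above
    return total


if __name__ == "__main__":
    # Three samples of sizes 5, 4 and 2 (n = 11, r = 5).
    print(median_test_statistic((2, 2, 1), (5, 4, 2)))
    print(median_test_statistic((5, 0, 0), (5, 4, 2)))
```

The second call is the most extreme table for these sample sizes: every observation below the median comes from the first sample.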


3.3 The Null Distribution.

We need to develop an expression for the density function
$h(u_1, \ldots, u_c, z)$ of $U_i$, $(i = 1, 2, \ldots, c)$, and Z. Let us
assume that the median Z is from sample j, then the probability
$P_j(u_1, \ldots, u_c, z)$ of obtaining these values in the contingency
table is given by

$$P_j(u_1, \ldots, u_c, z) = (n_j - u_j) K \Big\{\prod_{i=1}^{c} [(F_i(z))^{u_i} (1 - F_i(z))^{n_i - u_i}]\Big\} [1 - F_j(z)]^{-1} \frac{dF_j(z)}{dz} . \qquad (3.3)$$
From Equation (3.3), we obtain

$$h(u_1, \ldots, u_c, z) = \sum_{j=1}^{c} P_j(u_1, \ldots, u_c, z) = K \Big\{\prod_{i=1}^{c} [F_i(z)]^{u_i} [1 - F_i(z)]^{n_i - u_i}\Big\} \sum_{j=1}^{c} (n_j - u_j)[1 - F_j(z)]^{-1} \frac{dF_j(z)}{dz} . \qquad (3.4)$$

The null distribution $\varphi_0(u_1, \ldots, u_c)$ of $U_i$, $(i = 1, 2, \ldots, c)$,
under the null hypothesis is derived from $h(u_1, \ldots, u_c, z)$ by
substituting

$$F_1(z) = F_2(z) = \ldots = F_c(z) = F(z)$$

in (3.4) and integrating the resulting expression over the range
of z. This yields

$$\varphi_0(u_1, \ldots, u_c) = K(r+1) \int_0^1 F^r (1-F)^r \, dF . \qquad (3.5)$$

This result is in agreement with that obtained by
Chakravarti, Leone, and Alanen [13].
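
Since $(r+1)\int_0^1 F^r(1-F)^r\,dF = r!\,(r+1)!/(2r+1)!$, the null probability in (3.5) is K divided by the binomial coefficient $\binom{n}{r}$. The sketch below enumerates this distribution exactly. It assumes that K is the product of the binomial coefficients $\binom{n_i}{u_i}$ (by analogy with the definition of K given after (4.4)), and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def null_distribution(sizes):
    """Exact null probabilities of (u_1, ..., u_c), following (3.5).

    Assumes K = prod_i C(n_i, u_i), so that each probability is
    K / C(n, r) with n = 2r + 1.
    """
    n = sum(sizes)
    if n % 2 == 0:
        raise ValueError("the combined sample size is assumed to be odd")
    r = (n - 1) // 2
    denominator = comb(n, r)
    dist = {}
    for u in product(*(range(n_i + 1) for n_i in sizes)):
        if sum(u) == r:
            k = 1
            for n_i, u_i in zip(sizes, u):
                k *= comb(n_i, u_i)
            dist[u] = Fraction(k, denominator)
    return dist


if __name__ == "__main__":
    dist = null_distribution((5, 4, 2))
    print(len(dist), sum(dist.values()))
```

Exact fractions are used so that the probabilities can be checked to add up to one without rounding error.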
3.4 Power Function of Median T Test Against the Alternative of
Translation in the Exponential Population.

Here the alternative hypothesis considered is:

$$F_i(x) = \begin{cases} 1 - e^{-(x - a_i)} , & x \ge a_i , \\ 0 , & x < a_i , \end{cases} \qquad (3.6)$$

where $a_{i+1} \ge a_i$ $(i = 1, 2, \ldots, c-1)$, $a_1 \ge 0$.

The only necessary requirement is that $a_{i+1} \ge a_i$ $(i = 1, 2, \ldots, c-1)$.

The joint distribution of the $U_i$'s is obtained by substituting
(3.6) in the expression for the joint density function
$h(u_1, \ldots, u_c, z)$ given by (3.4) and integrating over the range of
z. The range of z can be reduced to $a_1 \le z < \infty$, since all of
the $F_i$'s are zero for $z < a_1$. This gives us

$$\Pr\{U_i = u_i \mid i = 1, 2, \ldots, c\} = \varphi_a(u_1, \ldots, u_c) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=a_1}^{\infty} \Big\{\prod_{i=1}^{c} [F_i(z)^{u_i} (1 - F_i(z))^{n_i - u_i}]\Big\} [1 - F_j(z)]^{-1} \, dF_j(z) . \qquad (3.7)$$
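Now, we can write

$$\int_{a_1}^{\infty} = \sum_{t=1}^{c} \int_{a_t}^{a_{t+1}} , \qquad a_{c+1} = \infty ,$$

and obtain

$$\varphi_a(u_1, \ldots, u_c) = \sum_{t=1}^{c} S_t(u) , \qquad (3.8)$$

where

$$S_t(u) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=a_t}^{a_{t+1}} \Big\{\prod_{i=1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j . \qquad (3.9)$$

Since $F_i = 0$ for $t+1 \le i \le c$, $S_t(u)$ will be zero unless $u_i = 0$.
Also, all terms for $j > t$ will be zero, since $F_j = 0$. Thus, if
we define $u_{c+1} \equiv 0$, (3.9) reduces to

$$S_t(u) = \sum_{j=1}^{t} (n_j - u_j) K \Big(\prod_{i=t}^{c} \delta_{u_{i+1},0}\Big) \int_{z=a_t}^{a_{t+1}} \Big\{\prod_{i=1}^{t} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j , \qquad (3.10)$$

where $\delta_{i,j}$ is the Kronecker delta.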


Let

$$\eta_{i,t} = e^{-(a_t - a_i)} , \qquad \gamma_{i,t} = 1 - \eta_{i,t} . \qquad (3.11)$$

Then we can substitute

$$F_i = 1 - \eta_{i,t}(1 - F_t) = \gamma_{i,t} + \eta_{i,t} F_t , \quad \text{for } a_t \le z \le a_{t+1} , \ 1 \le i \le t ,$$

(3.12)

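in (3.10). This gives us

$$S_t(u) = \sum_{j=1}^{t} (n_j - u_j) K \Big(\prod_{i=t}^{c} \delta_{u_{i+1},0}\Big) \int_{0}^{F_t(a_{t+1})} \Big\{\prod_{i=1}^{t} [1 - \eta_{i,t}(1 - F_t)]^{u_i}\Big\} \Big(\prod_{i=1}^{t} \eta_{i,t}^{\,n_i - u_i}\Big) [1 - F_t]^{\sum_{i=1}^{t}(n_i - u_i) - 1} \, dF_t . \qquad (3.13)$$

Expanding $[1 - \eta_{i,t}(1 - F_t)]^{u_i}$ by the binomial theorem, we get:

$$S_t(u) = \sum_{j=1}^{t} (n_j - u_j) K \Big(\prod_{i=t}^{c} \delta_{u_{i+1},0}\Big) \sum_{q_1=0}^{u_1} \cdots \sum_{q_t=0}^{u_t} \Big\{\prod_{i=1}^{t} (-1)^{q_i} \binom{u_i}{q_i} \eta_{i,t}^{\,n_i - u_i + q_i}\Big\} \int_{0}^{F_t(a_{t+1})} (1 - F_t)^{\sum_{i=1}^{t}(n_i - u_i + q_i) - 1} \, dF_t . \qquad (3.14)$$

This expression can be integrated to yield:

$$S_t(u) = K \Big(\prod_{i=t}^{c} \delta_{u_{i+1},0}\Big) \Big[\sum_{j=1}^{t} (n_j - u_j)\Big] \sum_{q_1=0}^{u_1} \cdots \sum_{q_t=0}^{u_t} \Big\{\prod_{i=1}^{t} (-1)^{q_i} \binom{u_i}{q_i} \eta_{i,t}^{\,n_i - u_i + q_i}\Big\} \Big[1 - \eta_{t,t+1}^{\,\sum_{i=1}^{t}(n_i - u_i + q_i)}\Big] \Big[\sum_{i=1}^{t} (n_i - u_i + q_i)\Big]^{-1} . \qquad (3.15)$$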


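For the case in which t = 1, (3.10) reduces to

$$S_1(u) = (n_1 - u_1) K \Big(\prod_{i=1}^{c} \delta_{u_{i+1},0}\Big) \int_{z=a_1}^{a_2} F_1^{u_1} (1 - F_1)^{n_1 - u_1 - 1} \, dF_1 , \qquad (3.16)$$

which can be integrated to yield

$$S_1(u) = K \Big(\prod_{i=1}^{c} \delta_{u_{i+1},0}\Big) \sum_{q_1=0}^{n_1 - u_1} (-1)^{q_1} \binom{n_1 - u_1}{q_1} (n_1 - u_1 - q_1)\, \gamma_{1,2}^{\,u_1 + q_1 + 1} [u_1 + q_1 + 1]^{-1} . \qquad (3.17)$$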

Consider the special case in which $a_1 = a_2 = \ldots = a_{c-1} = 0$
and $a_c = a > 0$. Let

$$\eta_{i,c} = \eta = e^{-a} , \quad \gamma_{i,c} = \gamma = 1 - e^{-a} , \quad F_1 = F_2 = \ldots = F_{c-1} = F , \quad F_c = G ,$$

$$U_c = \sum_{i=1}^{c-1} u_i , \qquad N_c = \sum_{i=1}^{c-1} n_i , \qquad (3.18)$$

then (3.7) becomes

$$\varphi_a(U_c, u_c) = (N_c - U_c) K \int_{z=0}^{\infty} F^{U_c} (1-F)^{N_c - U_c - 1} G^{u_c} (1-G)^{n_c - u_c} \, dF + (n_c - u_c) K \int_{z=0}^{\infty} F^{U_c} (1-F)^{N_c - U_c} G^{u_c} (1-G)^{n_c - u_c - 1} \, dG . \qquad (3.19)$$
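This can be written as

$$\varphi_a(U_c, u_c) = (N_c - r) K \delta_{u_c,0} \int_{z=0}^{a} F^{r} (1-F)^{N_c - r - 1} \, dF + (N_c - U_c) K \int_{z=a}^{\infty} F^{U_c} (1-F)^{N_c - U_c - 1} G^{u_c} (1-G)^{n_c - u_c} \, dF + (n_c - u_c) K \int_{z=a}^{\infty} F^{U_c} (1-F)^{N_c - U_c} G^{u_c} (1-G)^{n_c - u_c - 1} \, dG . \qquad (3.20)$$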

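Substitute $F = 1 - \eta(1 - G)$ for $a \le z < \infty$, in (3.20) and obtain

$$\varphi_a(U_c, u_c) = K \delta_{u_c,0} \sum_{q_1=0}^{N_c - r} (-1)^{q_1} \binom{N_c - r}{q_1} (N_c - r - q_1) \int_{0}^{\gamma} F^{\,r + q_1} \, dF + K(r+1) \sum_{q_1=0}^{U_c} \sum_{q_2=0}^{u_c} (-1)^{q_1 + q_2} \binom{U_c}{q_1} \binom{u_c}{q_2} \eta^{\,N_c - U_c + q_1} \int_{0}^{1} (1 - G)^{\,r + q_1 + q_2} \, dG , \qquad (3.21)$$

which upon integration yields

$$\varphi_a(U_c, u_c) = K \delta_{u_c,0} \sum_{q_1=0}^{N_c - r} (-1)^{q_1} \binom{N_c - r}{q_1} (N_c - r - q_1)\, \gamma^{\,r + q_1 + 1} [r + q_1 + 1]^{-1} + K(r+1) \sum_{q_1=0}^{U_c} \sum_{q_2=0}^{u_c} (-1)^{q_1 + q_2} \binom{U_c}{q_1} \binom{u_c}{q_2} \eta^{\,N_c - U_c + q_1} [r + q_1 + q_2 + 1]^{-1} . \qquad (3.22)$$

If c = 2, this result agrees with that obtained by Leone,
Chakravarti, and Alanen [13].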

These results can be used to compute the power of the test
for the case of the exponential alternative by first defining $t_\alpha$ by

$$\Pr\{T \ge t_\alpha \mid H_0\} \le \alpha ,$$

where the probability is evaluated using (3.5). The power is
calculated from

$$\Pr\{T \ge t_\alpha \mid H_a\} = \sum \varphi_a(u_1, \ldots, u_c) ,$$

where the sum extends over those $u_i$ such that $T \ge t_\alpha$.
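
The two steps just described can be mimicked numerically. The sketch below is not the exact computation of this chapter: it builds the rejection region from the exact null distribution, but then estimates the power by simulation rather than from the closed-form expressions. All function names, the number of replications and the seed are assumptions made for illustration, and K is again taken to be the product of binomial coefficients.

```python
import random
from itertools import product
from math import comb


def t_stat(u, sizes):
    """Chi-square statistic of the 2 by c table whose first row is u."""
    n, r = sum(sizes), sum(u)
    total = 0.0
    for u_i, n_i in zip(u, sizes):
        below, above = r * n_i / n, (n - r) * n_i / n
        total += (u_i - below) ** 2 / below
        total += (n_i - u_i - above) ** 2 / above
    return total


def rejection_region(sizes, alpha):
    """Tables with T >= t_alpha, where Pr{T >= t_alpha | H0} <= alpha."""
    n = sum(sizes)
    r = (n - 1) // 2
    rows = []
    for u in product(*(range(n_i + 1) for n_i in sizes)):
        if sum(u) == r:
            k = 1
            for n_i, u_i in zip(sizes, u):
                k *= comb(n_i, u_i)
            rows.append((t_stat(u, sizes), k / comb(n, r), u))
    rows.sort(reverse=True)  # largest values of T first
    region, size, i = set(), 0.0, 0
    while i < len(rows):
        j = i
        while j < len(rows) and abs(rows[j][0] - rows[i][0]) < 1e-9:
            j += 1  # collect all tables tied at this value of T
        block = sum(p for _, p, _ in rows[i:j])
        if size + block > alpha + 1e-12:
            break
        size += block
        region.update(u for _, _, u in rows[i:j])
        i = j
    return region, size


def simulated_power(sizes, shifts, alpha, reps=20000, seed=1):
    """Monte Carlo estimate of Pr{T >= t_alpha} under exponential shifts."""
    rng = random.Random(seed)
    region, _ = rejection_region(sizes, alpha)
    r = (sum(sizes) - 1) // 2
    hits = 0
    for _rep in range(reps):
        samples = [[a + rng.expovariate(1.0) for _k in range(n_i)]
                   for n_i, a in zip(sizes, shifts)]
        z = sorted(x for s in samples for x in s)[r]  # combined median
        u = tuple(sum(1 for x in s if x < z) for s in samples)
        if u in region:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(rejection_region((5, 4, 2), 0.05)[1])
    print(simulated_power((5, 4, 2), (0.0, 0.5, 1.0), 0.05))
```

With all shifts set to zero the estimate should be close to the attained size returned by `rejection_region`, which gives a simple check on the simulation.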

3.5 Power Function of the Median T Test Against Alternatives of
Change in Location and Scale of the Rectangular Population.

In this case, two sets of alternative hypotheses will be
considered, namely: one in which the location parameter changes and
another in which the scale parameter varies.
3.5.1 Change in Location of the Rectangular Population.

The alternative hypothesis is

$$F_i(x) = \begin{cases} x - a_i & \text{for } a_i \le x \le 1 + a_i , \\ 0 & \text{for } x < a_i , \\ 1 & \text{for } x > 1 + a_i , \end{cases} \qquad (3.23)$$

where again the populations are ordered according to their location
parameter:

$$0 \le a_1 \le a_2 \le \ldots \le a_c .$$

The joint distribution of the $U_i$'s is obtained by substituting
(3.23) for $F_i$ in the expression for the joint density
function $h(u_1, \ldots, u_c, z)$ given in (3.4) and integrating with
respect to z. The range of z can be reduced to $a_1 \le z \le 1 + a_c$,
since all of the $F_i$'s are zero for $z < a_1$ and one for $z > 1 + a_c$.
This yields

$$\Pr\{U_i = u_i \mid i = 1, 2, \ldots, c\} = \varphi_a(u_1, \ldots, u_c) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=a_1}^{1+a_c} \Big\{\prod_{i=1}^{c} [F_i(z)^{u_i} (1 - F_i(z))^{n_i - u_i}]\Big\} [1 - F_j(z)]^{-1} \, dF_j(z) . \qquad (3.24)$$


The  method  used  to  evaluate  the  integral  in  (3.24)  depends  upon  the 
relative  sizes  of  the  a^' s.  In  this  section  we  will  consider  only 
one  situation,  that  is  perhaps  the  most  interesting.  However,  any 
other  possible  situations  can  be  handled  in  an  analogous  fashion. 
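For the developments in this section we will assume that

$$a_c < a_1 + 1 . \qquad (3.25)$$

Then the integral in (3.24) can be partitioned by dividing the range
of z into 2c-1 pieces. This yields

$$\varphi_a(u_1, \ldots, u_c) = \sum_{t=1}^{c-1} \{P_t(u) + R_t(u)\} + Q_c(u) , \qquad (3.26)$$

where

$$P_t(u) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=a_t}^{a_{t+1}} \Big\{\prod_{i=1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j(z) , \qquad (3.27)$$

$$R_t(u) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=1+a_t}^{1+a_{t+1}} \Big\{\prod_{i=1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j(z) , \qquad (3.28)$$

$$Q_c(u) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=a_c}^{1+a_1} \Big\{\prod_{i=1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j(z) . \qquad (3.29)$$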


Consider $P_t(u)$. Since $F_i = 0$ for $t+1 \le i \le c$, $P_t(u)$
will be zero unless $u_i = 0$. Further, all terms for $j > t$ will be
zero. Thus for $1 \le t < c$, (3.27) reduces to

$$P_t(u) = \sum_{j=1}^{t} (n_j - u_j) K \Big(\prod_{i=t+1}^{c} \delta_{u_i,0}\Big) \int_{z=a_t}^{a_{t+1}} \Big\{\prod_{i=1}^{t} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} (1 - F_j)^{-1} \, dF_j . \qquad (3.30)$$

Let

$$\mu_{i,j} = a_j - a_i , \qquad \nu_{i,j} = 1 - \mu_{i,j} , \qquad (3.31)$$

then we can substitute

$$F_i = \mu_{i,t} + F_t , \quad 1 - F_i = \nu_{i,t} - F_t , \quad \text{for } a_t \le z \le a_{t+1} , \ 1 \le i \le t , \qquad (3.32)$$

in (3.30) and expand the resulting binomials. This gives us


U1  ut-l  nl-ul  nj-l-uj-l 


Pt(u)  = 


K(j£lv°)j £  •"  £  £  •••  £ 

\  /  J-i  q1=o  qt_1=o  v1=o  v .  ^=o 


nj-i-uj-i\  /VvVvi-v^ 


j-i 


vj  A  Vi 


F  (a  )  t-1  \ 

V  t+l;  U.  +  2  (qi+vi)  +V. 


nj-u-v 

vi,t  J]  J, 


i=l 


dF 


t  . 


(3.33) 
By noting that

$$(n_j - u_j) \binom{n_j - u_j - 1}{v_j} = (n_j - u_j - v_j) \binom{n_j - u_j}{v_j} ,$$

(3.33) can be integrated and simplified. This yields


Pt(u)  =  K 


Vi  VU1  nt-ut  l 

2  v. 


(^UZ-I  Z-I.v-1 

\  /  qx=0  qt_j=0  v1=0  vt=0 


•  t-1 

n 

l  i=l 


Vut\ft-1rui^i  ni-vvil\  . 

V  Vt/ li=lLi,t  Vi,t  Jj 


{ Z  lOj-n’/vj.tik! 

l3=i  J 


t-i 

2 

i=] 

t+1 


t-i  , 

[ut+^(qi+vi)+vt+1]"1> 

(3.34) 

for  t  =  2,  3»  ...»  c-1  .  P^(u)  is  easy  to  handle  and  is  given  by 


(-l)v(n1-r-v)u 


r+v+1 

1,2 


[r+v+1] 


-1 


(3.35) 


Similarly (3.28) can be simplified if we note that $F_i = 1$ for
$i \le t$ and hence (3.28) will be zero unless $(n_i - u_i) = 0$ for $i \le t$.
Also, all terms for $j \le t$ will be zero, and for $1 \le t \le c-2$, (3.28)
will reduce to
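$$R_t(u) = \sum_{j=t+1}^{c} (n_j - u_j) K \Big(\prod_{i=1}^{t} \delta_{n_i - u_i,0}\Big) \int_{z=1+a_t}^{1+a_{t+1}} \Big\{\prod_{i=t+1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j . \qquad (3.36)$$

If we substitute

$$F_i = \mu_{i,t+1} + F_{t+1} , \qquad 1 - F_i = \nu_{i,t+1} - F_{t+1} , \qquad (3.37)$$

for $1 + a_t \le z \le 1 + a_{t+1}$, $t+1 \le i \le c$, in (3.36) and expand the
binomials, we get: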

t+2  uc  Vl^t+l  Vuc  t 


Rt(u)  =  K(^an  -u  <d)  t  -I  I  -  £> Di=t+li 

^t+2=0  V°  vt+l=° 


l\f  £  fV%  WVH 

JliiaKttlVi»tfl  JI 


C 

l  ("rurvj)/vj, 

^j=t+l 


t+l 


J  F 


1  ut+l+iJ+2(<’i+vi)+vt+l 


t+l 


dFk 


*t,t+l 


'  t+l  , 

(3.38) 

since  =  Fj(l+a^)  •  This  expression  can  be  integrated  to  yield 


N  o 
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In  a  similar  fashion  using  (3.31)  and  (3.32),  (3.29)  can  be 
integrated  and  rsimplified  to  give: 
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and 


"1  Vl  "l-“l  V»0  IT. 

«.w-*E-E  l  ...l  <-1)1=1 

’l”0  Vi*°  V°  V° 


V 


%C 


1=1  *  *  -  J 


-1 


lj=l  J  (3.41) 


If we consider the special case in which $a_1 = \ldots = a_{c-1} = 0$
and $a_c = a$, where $0 < a < 1$, and let

$$\nu_{1,c} = \nu_{2,c} = \ldots = \nu_{c-1,c} = 1 - a ,$$

$$F_1 = F_2 = \ldots = F_{c-1} = F , \qquad F_c = G ,$$

$$U_c = \sum_{i=1}^{c-1} u_i , \qquad N_c = \sum_{i=1}^{c-1} n_i , \qquad (3.42)$$
then (3.24) can be written in the form:

$$\varphi_a(U_c, u_c) = K(N_c - r)\, \delta_{u_c,0} \int_{z=0}^{a} F^{r} (1-F)^{N_c - r - 1} \, dF + K(N_c - U_c) \int_{z=a}^{1} F^{U_c} (1-F)^{N_c - U_c - 1} G^{u_c} (1-G)^{n_c - u_c} \, dF + K(n_c - u_c) \int_{z=a}^{1} F^{U_c} (1-F)^{N_c - U_c} G^{u_c} (1-G)^{n_c - u_c - 1} \, dG + K(n_c - u_c)\, \delta_{N_c - U_c,0} \int_{z=1}^{1+a} G^{u_c} (1-G)^{n_c - u_c - 1} \, dG . \qquad (3.43)$$

Now following a procedure analogous to that used in the general
case in which we substitute for F, expand and integrate, we
get:
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Vr 


*a(Vuc> 


v,/  N  -r\  r+", +1 


=  K6. 


v° 


uc,oZ<-»\V>  1 

c  ..  -n  '  *  ' 


V,  +V, 


U  N  -U  n  -u 
c  c  c  c  c 

«E  E  E <-«'x 

^V°  v2=0 


c^l 


N  -U  +u  +q,+v_+l  _ 

(1-a)  [uc4q1^1+v2+l]-1[(Nc-Uc-v1)/(l-a)  + 


n  -u 
c  c 


(n 


-vV’  *  k‘»  -i  ,0  E  (-yTf 'Jtvv'ii  • 

c  c  -n  \  1  / 


V° 


u  +v,  +1  , 

[1  -(1-a)  C  1  ][ucfv1+l l"1 


(3.44) 


3.5.2 Change in Scale of the Rectangular Population.

The alternative hypothesis is

$$F_i(x) = \begin{cases} x/\theta_i , & 0 \le x \le \theta_i , \\ 0 , & x < 0 , \\ 1 , & x > \theta_i , \end{cases} \qquad (3.45)$$
where the scale parameters are ordered as follows:

$$0 < \theta_1 \le \theta_2 \le \ldots \le \theta_c .$$

The joint distribution of the $U_i$'s is obtained by substituting
(3.45) in the expression for $h(u_1, \ldots, u_c, z)$ given by
(3.4) and integrating over the range of z. The actual range in
this case is $0 \le z \le \theta_c$, since all of the $F_i$'s are zero for
$z < 0$ and one for $z > \theta_c$. Thus we obtain:

$$\Pr\{U_i = u_i \mid i = 1, 2, \ldots, c\} = \varphi_\theta(u_1, \ldots, u_c) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=0}^{\theta_c} \Big\{\prod_{i=1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j . \qquad (3.46)$$

The range of integration on z can be broken into c pieces. This
will yield c integrals, and (3.46) can be written as:

$$\varphi_\theta(u_1, \ldots, u_c) = Q_c(u) + \sum_{t=1}^{c-1} R_t(u) , \qquad (3.47)$$

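where

$$Q_c(u) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=0}^{\theta_1} \Big\{\prod_{i=1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j , \qquad (3.48)$$

$$R_t(u) = \sum_{j=1}^{c} (n_j - u_j) K \int_{z=\theta_t}^{\theta_{t+1}} \Big\{\prod_{i=1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j . \qquad (3.49)$$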


If we let

$$\rho_{i,j} = \theta_j/\theta_i , \qquad (3.50)$$

then (3.48) can be simplified if we substitute

$$F_i = \rho_{i,1} F_1 , \quad \text{for } 0 \le z \le \theta_1 , \ 1 \le i \le c , \qquad (3.51)$$

in (3.48), expand the binomials and integrate. This yields


Vui 


Qc(u)  =  K  l  . 


V° 


Vuc 


2  v 
i=l  1 


vc=o 


u^+v 

*1.1 


t!nrV'i)*j,i 


-1 


(3.52) 


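Following a similar development, (3.49) can be simplified, if we
note that (3.49) will be zero unless $(n_i - u_i) = 0$, $1 \le i \le t$, since
$F_i = 1$, $i = 1, 2, \ldots, t$. Thus for $1 \le t < c$, (3.49) reduces to

$$R_t(u) = \sum_{j=t+1}^{c} (n_j - u_j) K \Big(\prod_{i=1}^{t} \delta_{n_i - u_i,0}\Big) \int_{z=\theta_t}^{\theta_{t+1}} \Big\{\prod_{i=t+1}^{c} [F_i^{u_i} (1 - F_i)^{n_i - u_i}]\Big\} [1 - F_j]^{-1} \, dF_j . \qquad (3.53)$$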
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p  M 


then  (3.47)  reduces  to 


"c^c  nc-uc  /-  -\  .  . 

T’tv2  V"c)  V"cU “cw2 


VVV  =  I  E  E<-«  1  V,  7U  °)(Ve) 

tr  “H  1*  —A 


v^=0  v2=0 


[ (No*Uc"Vl^  nc“uc-v2)/0^  r+Vl+v2+l] '1  + 


n  -u 
c  c 


“h.u-,0 

C  C  ..  -A  V  1  / 


u  +v,+l 


v° 


,  (3.58) 

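using techniques similar to those used in the general case.

Again these results can be used to compute the power of the
test by first defining $t_\alpha$ by

$$\Pr\{T \ge t_\alpha \mid H_0\} \le \alpha ,$$

then the power is computed from

$$\Pr\{T \ge t_\alpha \mid H_a\} = \sum \varphi_\theta(u_1, \ldots, u_c) ,$$

where the sum extends over those $u_i$ such that $T \ge t_\alpha$.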
3.6 Results.

The exact power of the T statistic has been computed for
the case of three samples using various exponential alternatives.
These results are presented in Table 3.1 for total sample sizes of
11 and 15. Since the computations were performed in single precision
arithmetic, the values computed for the sample of size 15 may
have errors as large as ± 2 in the third decimal place as indicated
by the cumulative sum.

The power of the test, in general, increases with a positive
shift in the locations, especially when the test is unbiased. However,
several cases can be noted in which this trend fails to occur,
indicating that in these cases the test is biased. (For example, see
Table 3.1 for $n = 11$, $n_1 = 5$, $n_2 = 4$, and $\alpha = 0.0476$.) This effect
seems to occur frequently when $\alpha$ is very small. An analysis
of the complete distribution of the T statistic indicates that the
actual distribution becomes highly "peaked" in addition to shifting
in the positive direction when the location parameters increase.
This means that the actual tail area for small $\alpha$'s can decrease
even though the distribution is shifting in the positive direction.

3.7 Further Extensions.

The computational results presented in Table 3.1 can be
extended to larger sample sizes and to other combinations of $n_1$, $n_2$,
and $n_3$. The accuracy of the results can also be improved by
performing all of the computations in double precision arithmetic.
Similar tables can be computed with rectangular alternatives. The
results can also be extended to more than three samples when more
efficient computational equipment is available.

Similar expressions for the power of the test can be derived
for a set of exponential alternatives with a change in scale.
Also, extensions similar to those indicated in Chapter I may be
applied to this test.
Table 3.1  Exact power of the T test for three samples against
exponential alternatives, total sample sizes 11 and 15. (The tabulated
values are not recoverable; the footnote to the table flags the entries
that indicate that the test is biased.)


CHAPTER  IV 


EXACT POWER OF SOME TESTS BASED ON A GENERALIZATION OF
MASSEY'S STATISTIC


4.1 Introduction.

The exact power of some tests based on Massey's statistic
for the case of two samples has been investigated by Chakravarti,
Leone, and Alanen [13]. The purpose of the investigation in this
chapter is to extend their results to the case of discriminating
between c populations on the basis of c ordered samples.


4.2  The  c  Sample  Problem. 

Let $\{X_1^{(i)}, X_2^{(i)}, \ldots, X_{n_i}^{(i)}\}$ for $i = 1, 2, \ldots, c$ be c
sets of independently distributed random variables with continuous
cumulative distribution functions $F_i$, respectively. We wish to
test the hypothesis

$$H_0 : F_1(x) = F_2(x) = \ldots = F_c(x) ,$$

against the alternative

$$H_a : F_1(x) > F_2(x) > \ldots > F_c(x) .$$

Let $n_i$ denote the size of the $i$th sample, and $n = \sum_{i=1}^{c} n_i$,
the size of the combined sample. For simplicity, we will assume that
$n = 4r + 1$, where r is an integer. Also let $Z_1$ and $Z_2$ denote
respectively the first quartile and median of the combined sample.
Let $U_{1,i}$ and $U_{2,i}$ denote respectively the number of observations
in the $i$th sample less than $Z_1$ and the number of observations
in the $i$th sample that are greater than or equal to $Z_1$ but less
than $Z_2$, $i = 1, 2, \ldots, c$. The results for the combined sample
can be arranged in a 3 by c contingency table showing $U_{t,i}$, the
number of values from the $i$th sample in the $t$th interval
$(i = 1, 2, \ldots, c;\ t = 1, 2, 3)$. This table will have the following
form:


Number of observations less than the first quartile
and between the first quartile and the median.


Intervals            1st Sample   2nd Sample   ...   cth Sample   Total

1. x < Z_1           u_{1,1}      u_{1,2}      ...   u_{1,c}      S_1 = r

2. Z_1 <= x < Z_2    u_{2,1}      u_{2,2}      ...   u_{2,c}      S_2 = r

3. x >= Z_2          u_{3,1}      u_{3,2}      ...   u_{3,c}      S_3 = 2r + 1

Total                n_1          n_2          ...   n_c          n


where

$$U_{3,i} = n_i - U_{1,i} - U_{2,i} , \qquad i = 1, 2, \ldots, c .$$

To test the null hypothesis

$$H_0 : F_1(x) = F_2(x) = \cdots = F_c(x),$$

the usual chi-square statistic, $T$, based on the set $\{u_{t,i}\}$ may
be used. We reject $H_0$ for large values of $T$, where the statistic
$T$ is defined as follows:

$$T = \sum_{i=1}^{c} \sum_{t=1}^{3} \left\{ [u_{t,i} - (n_i S_t / n)]^2 / (n_i S_t / n) \right\}. \tag{4.1}$$

The test rule based on the statistic $T$ is:

reject $H_0$ if $T \ge t_\alpha$,

accept $H_0$ if $T < t_\alpha$,

where $t_\alpha$ is chosen so that $\Pr\{T \ge t_\alpha \mid H_0\} \le \alpha$, and $\alpha$ is the
preassigned level of significance.
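
As an illustration of the construction above, the following Python sketch builds the 3 by c table from raw samples and evaluates (4.1). The function names `massey_table` and `massey_T` are hypothetical and do not come from the original work. The sketch assumes that there are no ties, that every sample is non-empty, and that n = 4r + 1, with $Z_1$ and $Z_2$ taken as the (r+1)th and (2r+1)th order statistics of the combined sample, so that the row totals are r, r and 2r + 1.

```python
def massey_table(samples):
    """Return (table, row_totals) for the 3 x c layout of Section 4.2.

    table[i] holds (U_1i, U_2i, U_3i) for the i-th sample.
    """
    pooled = sorted(x for s in samples for x in s)
    n = len(pooled)
    r, rem = divmod(n - 1, 4)
    if rem != 0 or r == 0:
        raise ValueError("this sketch assumes n = 4r + 1 with r >= 1")
    z1 = pooled[r]        # (r+1)th order statistic: first quartile
    z2 = pooled[2 * r]    # (2r+1)th order statistic: median
    table = []
    for s in samples:
        u1 = sum(1 for x in s if x < z1)
        u2 = sum(1 for x in s if z1 <= x < z2)
        table.append((u1, u2, len(s) - u1 - u2))
    return table, (r, r, 2 * r + 1)


def massey_T(samples):
    """Chi-square statistic (4.1) computed from the 3 x c table."""
    table, totals = massey_table(samples)
    n = sum(totals)
    t_stat = 0.0
    for column, s in zip(table, samples):
        for u, s_t in zip(column, totals):
            expected = len(s) * s_t / n   # n_i * S_t / n
            t_stat += (u - expected) ** 2 / expected
    return t_stat
```

For three perfectly separated samples such as `[[1, 2], [3, 4, 5], [6, 7, 8, 9]]`, the table columns are (2, 0, 0), (0, 2, 1) and (0, 0, 4), and T takes its largest attainable value for these sample sizes.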


4.3 The Null Distribution.

Let $P_{i,j}(U_{t,i} = u_{t,i},\ Z_1 = z_1,\ Z_2 = z_2)$ denote the joint
probability density of $\{U_{t,i}\}$, $Z_1$ and $Z_2$, when $Z_1$ belongs to
the $i$th sample, and $Z_2$ belongs to the $j$th sample, $i, j = 1, \ldots, c$.
Then the expression for $P_{i,j}$ is given by:

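$$P_{i,j} = \left\{ \prod_{m=1}^{c} \frac{n_m!}{u_{1,m}!\,(u_{2,m}-\delta_{i,m})!\,(u_{3,m}-\delta_{j,m})!}\, [F_m(z_1)]^{u_{1,m}}\, [F_m(z_2) - F_m(z_1)]^{u_{2,m}-\delta_{i,m}}\, [1 - F_m(z_2)]^{u_{3,m}-\delta_{j,m}} \right\} dF_i(z_1)\, dF_j(z_2), \tag{4.2}$$

where $\delta_{i,m}$ is the Kronecker delta.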


Hence, the joint density of the $U_{i,j}$'s, $Z_1$ and $Z_2$ is given by

$$h(u_{i,j}, z_1, z_2) = \sum_{i=1}^{c} \sum_{j=1}^{c} P_{i,j}. \tag{4.3}$$

The null distribution $\varphi_0(u_{i,j})$ of the $U_{i,j}$'s, ($i, j = 1, 2, \ldots, c$),
under the null hypothesis is derived from $h(u_{i,j}, z_1, z_2)$ by substituting

$$F_1(z) = F_2(z) = \cdots = F_c(z) = F(z)$$

in (4.3) and integrating the resulting expression over the range of
$z_1$ and $z_2$. This yields

$$\varphi_0(u_{i,j}) = \iint_{-\infty < z_1 < z_2 < \infty} h(u_{i,j}, z_1, z_2)\, dz_1\, dz_2
= K r (2r+1) \iint [F(z_1)]^{r}\, [F(z_2) - F(z_1)]^{r-1}\, [1 - F(z_2)]^{2r}\, dF(z_1)\, dF(z_2), \tag{4.4}$$

where

$$K = \prod_{m=1}^{c} \binom{n_m}{u_{1,m},\ u_{2,m}} = \prod_{m=1}^{c} \frac{n_m!}{u_{1,m}!\, u_{2,m}!\, u_{3,m}!}.$$

Letting $Q_1 = F(z_1)/F(z_2)$ in the innermost integral, (4.4) can be
integrated. This yields:

$$\varphi_0(u_{i,j}) = K r (2r+1)\, B(r+1,\ r)\, B(2r+1,\ 2r+1), \tag{4.5}$$

where $B(i,j) = [(i+j-1)!]^{-1}\, [(i-1)!\,(j-1)!]$.

This  expression  can  be  rewritten  in  the  following  form: 


$$\varphi_0(u_{i,j}) = \left[ \prod_{m=1}^{c} \frac{n_m!}{u_{1,m}!\, u_{2,m}!\, u_{3,m}!} \right] \left[ \frac{n!}{r!\, r!\, (2r+1)!} \right]^{-1}. \tag{4.6}$$

For the special case in which c = 2, (4.6) reduces to


(4.7)

which agrees with the result obtained by Chakravarti, Leone, and
Alanen [13]. It should be noted that the statistic T defined by
(4.1) under the null hypothesis is distributed approximately as
chi-square with 2(c-1) degrees of freedom. However, the exact
distribution may be calculated from (4.6) and (4.1).
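
The exact calculation mentioned above can be sketched in Python as follows. This is an illustration rather than the original program: the function names are hypothetical, and the sketch assumes that the null probability of a table is the product of multinomial coefficients K divided by n!/(r! r! (2r+1)!), which is the value obtained when the Beta functions in the integrated form of (4.4) are written out in factorials. Exact rational arithmetic is used so that comparisons of T with a critical value are not affected by rounding.

```python
from fractions import Fraction
from math import factorial


def compositions(total, caps):
    """Yield tuples k with sum(k) == total and 0 <= k[i] <= caps[i]."""
    if len(caps) == 1:
        if 0 <= total <= caps[0]:
            yield (total,)
        return
    for k in range(min(total, caps[0]) + 1):
        for rest in compositions(total - k, caps[1:]):
            yield (k,) + rest


def null_tables(sizes):
    """Yield (T, probability) for every 3 x c table with the given sample sizes."""
    sizes = tuple(sizes)
    n = sum(sizes)
    r, rem = divmod(n - 1, 4)
    if rem != 0 or r == 0:
        raise ValueError("this sketch assumes n = 4r + 1 with r >= 1")
    row_totals = (r, r, 2 * r + 1)
    denom = Fraction(factorial(n), factorial(r) ** 2 * factorial(2 * r + 1))
    for u1 in compositions(r, sizes):
        caps2 = tuple(ni - a for ni, a in zip(sizes, u1))
        for u2 in compositions(r, caps2):
            u3 = tuple(ni - a - b for ni, a, b in zip(sizes, u1, u2))
            k = 1  # product of multinomial coefficients, the constant K
            for ni, a, b, d in zip(sizes, u1, u2, u3):
                k *= factorial(ni) // (factorial(a) * factorial(b) * factorial(d))
            t_stat = Fraction(0)
            for ni, column in zip(sizes, zip(u1, u2, u3)):
                for s_t, u in zip(row_totals, column):
                    expected = Fraction(ni * s_t, n)
                    t_stat += (u - expected) ** 2 / expected
            yield t_stat, k / denom


def exact_alpha(sizes, t):
    """Exact Pr{T >= t | H_0} by summing over all admissible tables."""
    return sum(p for t_stat, p in null_tables(sizes) if t_stat >= t)
```

For sample sizes (2, 3, 4), `exact_alpha((2, 3, 4), Fraction(69, 5))` returns 1/126, or about .008, which agrees with the first entry of Table 4.1 (t = 13.800).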


4.4 Power Function of T Test Against the Alternative of Translation


Let $\varphi_a(u_{i,j})$ denote the probability $\Pr\{U_{t,i} = u_{t,i},\
t = 1, 2, 3,\ i = 1, 2, \ldots, c\}$, when the alternative hypothesis
$H_a$ is true. Then $\varphi_a(u_{i,j})$ is given by:

$$\varphi_a(u_{i,j}) = \iint_{-\infty < z_1 < z_2 < \infty} h(u_{i,j}, z_1, z_2)\, dz_1\, dz_2. \tag{4.9}$$
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Q 


3,3 


C  C  Vl  z2 

«  l  l  J  J  ' 

i*l  j  =1  z_=a„  z,=a„ 

*  <C  3  1  3 


and 


i.j 


m-l 


)] 


‘l,m, 


tVz2>  - 


VZ1>J 


u2,nf6i,m 


ll-Wl  3’"  J’“)  “2,1  -J.J 


with $a_{c+1} = \infty$.

In $Q_{s,t}$, $F_i(z_1) = 0$ for $t+1 \le i \le c$ and $F_j(z_2) = 0$ for
$s+1 \le j \le c$. Hence, for $t+1 \le i \le c$ and $s+1 \le j \le c$, $Q_{s,t}$ will
be zero unless $u_{1,i} = 0$ and $u_{2,j} = 0$. Thus, if we define
$u_{1,c+1} = u_{2,c+1} = 0$, (4.12) and (4.13) reduce to


‘s,t 


K(mn6ul,tt+1,°]  [m=s6u2,m+l,0]i^L 


.3+1  “t+1 


J  J 


Z2=a3 


■l  "t 


-6_  ,  3 


”{F.<»1>  1’”tFm(z2)-f'm(al”  2’“  ”  tF«<“2>  2’“1 

m=l  a=t+l 


Un  -6 


m^(U-Fm(.2)]  3’“  "’J)  <*Fl<.l)  Ofj^)  , 


(4.14)

In (4.14). This gives us


Qs,t  =  K 


11  6u  0 

m=t  ul,m+l,u 


m=s  u2,m+l’^ 


2  u 

i=l  Z>1 


-S+l  ptj.t+lj't-l  u.  \  u,  . 

f_  [  ’  )W  ’ 


Z=8l 

2  s 


[Ft(.a)-rt(. 


?1u2,m-1f  s 

1J)  U 

U=t+i 


Fm(z2) 


Mu. 


/  3»J 


{ji[l-Fm(22)],'3-"'‘"-l}  dFt(h)  dF^)  _ 


(4.17) 


Assuming  that  t  >  1,  we  can  expand  [y  .+  Tl  .F.(z,)]  1,0  fbr 

m  $  t  iHjt  t  i 


m?iU2,m-1 

1  <  m  <  t,  -Ft(z^)  ]  9  an^  Per^onn  innermost 


integration

Starting from (4.17), the results for the cases in which s = t+1
and t = 1 can be obtained in a similar fashion and are summarized
below:


u,  ,  u. 


Qt+i,t  =  Ki 


ns  n 

m=t  ul,m+l,U 


‘  c 

IT  fi  * 

m=tfl  u2,m+l,U 


1,1  l,t-l  ”2,t 

l-  l  l 

^i=o  ■’t-r0  "=0 


A2,t“w  A3,t+1 

J  E(-» 

qt=0  v=0 


v+v 


A3,t+l]  fA2,t*w' 


(A3,t+l"v) 


ft-1 

(A2,t-w^t}  \ 

lm-1 


_u2,m+Sn  ul,nT^m|  1, 
m,t  Ym,t 


t-1 

un  . +A0  >  2  q  -q,  .  , 

!,t  2,t  m=14m  ->t|  t  u 


Yt,t+1 


t  u,  \  q  t-1  -1 

\,t+lj  \,t+l£ul,t+  2.  qmHw+1^ 
^m-1  ;  rn=l 


U2,t+l^t+v+1r  .  +  +11-1 

Yt+l,t+-2  ^u2,t+-l+qt+v' +1^  , 


(4.21) 
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Q.,l  =  K 


[m^l  Ul,m+l,0J 


TI  6  n 

Lm=s  u2,m+l,U. 


U2,l  U2,1"W  "2,2  u2,s-l 

I  l  l-  l 

v=0  q1=0  q2=0  qs_j=0 


3,s 


i  <-« 


v+w 


v=0 


‘•-!i \rK~vw 


3-1  U„  +Q 


m: 


i.  Vr’NI  ? 

i=l  ^'s  >\m=2 


U0 

2,m 

Tm,s 


vi, 


s— 1 

u0  4*  2  q  +v+l  _ 

I  i  2 ,  s  ,13  3-1  , 

yl,2  Iul,lWl)  vS,S.l  '  t“2,s*  'r*1! 

Ill— 1 


ul,l+w+1 


(4.22) 


^2,1  K 


71  6u  0 

m=l  l,m+l,U 


TT  6 

m=2  u2,m+l’U 


u2,l  u2,l~w  ^3,2 


Z  Z  Z  . 


w=0  q^=0  v=0 


U2  > 1 }  [  3  >  2 ] \2 , l-w 


“3,1%. 


w7l  v  A  ql  )  (u2,rW“ql)(A3,2_v^M  ^ul,l4v+1^ 


Ul,l+U2,l-ql  u2,2+ql+V+1 

%2 


v2,r  ^u2,2+qi+v+i^ 1 . 


(4.23)

Now, if we let L - Fc'z ^ in the innermost integral
and integrate, we obtain


; ^ 


f 


\ 
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~  c  1 

Ul,l 

U1  - 

l,s-l 

Q  =  K 

s,s 

m=s  ul,m+l*°  u2,m+l,0J 

L  - 

l 

q^O 

q  .=0 
^s-1 

(4.26) 


or  rewriting  the  Beta  function  in  terms  of  factorials  one  obtains 
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+u. 


3-1 


(4.27) 


For  the  case  in  which  s  =  1,  (4.27)  becomes 


Qi,i  =  K 


c 
IT  6 
m=l 


Ul,m+1>0  u2,m+l’° 


U1,1IU2,11[(U1,1^,1)1] 


,-1 


U3,l 

I  (-DV 


v=0 


U2,l+Ul,l+V+1 


(4.28)

Also in the case s = c, (4.20) and (4.27) can be simplified
since the outermost integral becomes a complete Beta function.
These results are summarized below.


fc  nUl,l  ul,t-l  A2,t  A2,fw  u2,t+l  u2 , c-1 

-uljmtl,o  E-  III  E-  E  <-1)U  • 

J,l=0  Vl=0“=°  qt.l=0  qc-l'° 


«c,t=K 


A2,t  A2,t'W 


t_w\  (t-lf  u_  +q  u,  -q  Ai-i  \1 

Bvm  r  m  H  • 


m,c  Tm,c  \  q  /  (  } 
\  m  /  }  (. 


TT 

m=t+l 


m  /  J  >  l  m=l 


n  v;  v. 


t-i 

u,  2  < 


ui,t*  ?.v“+i  ^ 


E  q,h'ti!'1  (2rtl)'f2>c4‘ E  0-  • 

m=l  \  m=t  / 


(4.29) 
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ul,l  ul,c-2 

I-  E 


v0  v2=0 


w=0 


A_  .  +u-  ,  +■  2  q_-q  , 

2 , c-1  l,c-l  „£i>^c-l 

Yc-l,c 


;-2 


m=l 


(4.30) 


ul,l  "1,0-1 


k  f" ...  V  /n1 

.  ^Ux  L'*’c  *’°  vwJj 


qr°  %-r° 


c-1  c-1 

r l(2r*l)l L  t c*  £  qj [ (uj _ c*3r*  £  V« '•  1 ' 

m-1  m=l 


(4.31) 


4.5 Results.

The exact power of the T statistic has been computed for
three samples using several sets of exponential alternatives. These
results are presented in Table 4.1 for total sample sizes of 9 and
13. Since the computations were performed in single precision
arithmetic, the error in the computations is rather severe and, at
times, amounts to ± 1 in the second decimal place, as indicated by
the cumulative sum.

As in the case of Mood's test, the power, in most cases, increases
with a positive shift in the location parameters, but, again,
several cases can be noted in which the increase does not occur.
This indicates that the test is biased under these circumstances.
As stated in Chapter III, this effect can be attributed to a sharp
"peaking" in the shape of the distribution of the T statistic.

4.6  Further  Extensions. 

Expressions similar to those developed in this chapter can
be derived for a change in scale and location of a rectangular
distribution and a change in scale of an exponential distribution. The
results presented in Table 4.1 can be extended to larger samples and
other alternatives. The accuracy of the computations can be
improved by using double precision arithmetic. Also, some extensions
similar to those indicated in Chapter I can be applied to this test.


TABLE 4.1

Exact Power of Massey's Test for Three Samples with
Exponential Alternatives

                                    Pr{ T ≥ t | a_1, a_2, a_3 }

                                a_1 = 0     a_1 = 0             a_1 = 0
                                a_2 = .1    a_2 = [illegible]   a_2 = .2
 n_1   n_2      t        α      a_3 = .2    a_3 = .5            a_3 = .5

  2     3     13.800    .008     .016        .043                .044
              11.700    .024     .039        .063                .071
               9.075    .056     .091        .128                .154
               8.850    .079     .119        .197                .204
               7.500    .127     .159        .237                .238

  3     3      9.600    .071     .084        .110                .115

  4     3     13.800    .008     .004*       .002*               .002*
              11.700    .024     .020*       .010*               .012*
               9.075    .056     .042*       .022*               .025*
               8.850    .079     .070*       .052*               .070*
               7.500    .127     .135        .149                .166

  2     4     16.050    .002     .004        .012                .008
              11.718    .007     .015        .053                .046
              11.099    .019     .030        .072                .063
               9.728    .054     .080        .150                .139
               8.932    .073     .102        .205                .178
               8.269    .112     .163        .294                .280

  4     4     16.714    .001     .002        .009                .010
              13.929    .006     .008        .015                .017
              12.071    .013     .021        .047                .050
               9.657    .055     .070        .111                .127
               8.976    .078     .104        .176                .189
               7.738    .135     .179        .312                .306

  7     4     16.050    .002     .001*       .000*               .000*
              11.718    .007     .004*       .002*               .002*
              11.099    .019     .014*       .007*               .008*
               9.728    .054     .048*       .041*               .043*
               8.932    .073     .070*       .061*               .081
               8.269    .112     .100*       .076*               .097*

* These values indicate a bias in the test.


CHAPTER  V 


ANALYSIS  OF  CATEGORICAL  DATA 


5.1 Introduction.

The usual method of testing a hypothesis concerning categorical
data is to compute a statistic T, which is distributed
approximately as $\chi^2$. Then the hypothesis is accepted or rejected
depending upon the relationship of the observed value of T to a
predetermined critical value obtained from the $\chi^2$ distribution.
Since T is only approximately distributed as $\chi^2$, the exact
distribution of T under several different null hypotheses has been
computed for both one-way and two-way classifications, and these
results have been compared with those obtained from the $\chi^2$
approximation, in order to determine when the approximation is valid. Also
the exact power of the T test has been computed for a one-way
classification with fixed alternatives, and these results compare
favorably with those obtained from a non-central $\chi^2$ approximation.


Extensive  tables  of  the  non-central  chi-square  distribution 


given  by 


F(\,v,y)  =  f 


~2  ~2  “  v/2+‘3"1  < 
e---e  lX 


J 


4  r(v/2+J)22Jji 


dx  , 


where $\lambda$ denotes the non-centrality parameter and $\nu$ the degrees
of freedom, have been computed at Case Institute of Technology [8].
The computational algorithm suggested by Professor N.L. Johnson [10]
reduces the integral to a double Poisson sum:


,  e-x  J-£ 

Qj(x)  '  nj  +  *) 


i  >  i 


5.2 One-way Classification.

In this case the set of $n$ observations is partitioned into
$k$ cells $\{A_i \mid i = 1, 2, \ldots, k\}$ with $n_i$ observations in cell
$A_i$. We will consider the null hypothesis given by

$$H_0 : \{\pi_i \mid i = 1, 2, \ldots, k\} \ \text{such that} \ \sum_{i=1}^{k} \pi_i = 1, \tag{5.1}$$

where $\pi_i$ represents the probability that an observation will fall
in cell $A_i$. The usual statistic T is defined in this case by

$$T = \sum_{i=1}^{k} (n_i - n\pi_i)^2 / (n\pi_i). \tag{5.2}$$


The observed frequencies, $n_i$, will be distributed as the
multinomial distribution

$$\varphi_0(n_1, \ldots, n_k) = (n!) \prod_{i=1}^{k} \pi_i^{\,n_i} / n_i!\,. \tag{5.3}$$

Hence the exact distribution of T can be computed from

$$\Pr\{T \ge t \mid H_0\} = \sum \varphi_0(n_1, \ldots, n_k), \ \text{such that} \ T \ge t, \tag{5.4}$$

where $t$ is a fixed value of T. The results for the exact
distribution of T as a function of the sample size, $n$, are displayed
in figures (5.1) and (5.2) for two different null hypotheses. The
approximating chi-square distribution follows the exact distribution
very closely even for small values of $n$.

The exact power of the test with the significance level at
$\alpha$ can be evaluated by considering an alternative hypothesis such
as:

$$H_a : \{p_i \mid i = 1, 2, \ldots, k\}, \ \text{where} \ \sum_{i=1}^{k} p_i = 1. \tag{5.5}$$

Then the power of the test against this alternative will be given by

$$\Pr\{T \ge t_\alpha \mid H_a\} = \sum_{n_i} \varphi_a(n_1, \ldots, n_k), \ \text{such that} \ T \ge t_\alpha, \tag{5.6}$$

where $\varphi_a$ is defined by

$$\varphi_a(n_1, \ldots, n_k) = (n!) \prod_{i=1}^{k} p_i^{\,n_i} / n_i!\,, \tag{5.7}$$

and $t_\alpha$ is defined implicitly by

$$\Pr\{T \ge t_\alpha \mid H_0\} \le \alpha. \tag{5.8}$$
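
The enumeration behind (5.4) and (5.6) can be sketched in Python as follows. The function names are hypothetical and the code is an illustration under stated assumptions, not the original program: it lists every way of placing n observations in k cells, weights each outcome by its multinomial probability, and sums over the outcomes for which T, computed from the null probabilities as in (5.2), reaches the chosen value. Probabilities are passed as `Fraction` objects so that the result is exact.

```python
from fractions import Fraction
from math import factorial


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Multinomial probability of the cell counts, as in (5.3) and (5.7)."""
    coef = factorial(sum(counts))
    for k in counts:
        coef //= factorial(k)
    p = Fraction(coef)
    for k, pi in zip(counts, probs):
        p *= Fraction(pi) ** k
    return p


def chi_square_T(counts, null_probs):
    """Statistic (5.2), always computed from the null cell probabilities."""
    n = sum(counts)
    return sum((k - n * pi) ** 2 / (n * pi) for k, pi in zip(counts, null_probs))


def exact_tail(n, null_probs, t, true_probs=None):
    """Pr{T >= t}: size (5.4) if true_probs is None, otherwise power (5.6)."""
    if true_probs is None:
        true_probs = null_probs
    return sum(
        multinomial_pmf(c, true_probs)
        for c in compositions(n, len(null_probs))
        if chi_square_T(c, null_probs) >= t
    )
```

As a small check, with n = 2 and two equally likely cells the only outcomes with T ≥ 2 are (2, 0) and (0, 2), so the exact tail probability is 1/2 under the null hypothesis and 5/8 under the alternative (1/4, 3/4).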


If we let $X$ be a $\chi^2$ variable with $k-1$ degrees of freedom,
then T will be approximately distributed as $X$ [4].
Patnaik [19] has shown that the distribution of T under the
alternative hypothesis can be approximated by the non-central $\chi^2$
distribution. This fact when applied to (5.6) yields:

$$\Pr\{T \ge \chi^2_\alpha \mid H_a\} \approx \int_{\chi^2_\alpha}^{\infty} f(\chi^2)\, d(\chi^2), \tag{5.9}$$

where

$$f(\chi^2) = e^{-\frac{1}{2}(\chi^2 + \lambda)} \sum_{j=0}^{\infty} \left[ (\chi^2)^{\frac{1}{2}\nu + j - 1}\, \lambda^j\, [\Gamma(\tfrac{1}{2}\nu + j)\, 2^{\frac{1}{2}\nu + 2j}\, j!]^{-1} \right]$$

is the non-central chi-square distribution with $\nu$ degrees of
freedom and $\lambda$ is the non-centrality parameter. In our case

$$\nu = k - 1, \qquad \lambda = n \sum_{i=1}^{k} (p_i - \pi_i)^2 (\pi_i)^{-1}. \tag{5.10}$$

This function has been extensively tabulated in [8]. Also, Patnaik
[19] has shown that the non-central $\chi^2$ distribution can be
approximated by a central $\chi^2$ distribution by equating the first two
moments of the distributions. This yields a further approximation
to (5.6), namely:

$$\Pr\{T \ge \chi^2_\alpha\} \approx \int_{\chi^2_\alpha / \rho}^{\infty} g(y)\, dy, \tag{5.11}$$

where

$$g(y) = e^{-\frac{1}{2}y}\, y^{\frac{1}{2}\nu' - 1}\, 2^{-\frac{1}{2}\nu'}\, [\Gamma(\tfrac{1}{2}\nu')]^{-1}, \quad \text{and}$$

$$y = \chi^2/\rho, \qquad \rho = (\nu + 2\lambda)(\nu + \lambda)^{-1}, \qquad \nu' = (\nu + \lambda)^2 (\nu + 2\lambda)^{-1}.$$
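
The two approximations can be evaluated numerically along the following lines. This Python sketch is only an illustration with hypothetical function names. It assumes the standard representation of the non-central chi-square distribution as a Poisson mixture of central chi-square distributions (with weights from a Poisson distribution of mean λ/2), computes the central tail from the power series of the incomplete gamma function, and uses the moment-matching constants ρ and ν' given with (5.11). It is not Johnson's double Poisson algorithm.

```python
import math


def gamma_p(a, x):
    """Regularized lower incomplete gamma function P(a, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-16:
        term *= x / (a + k)
        total += term
        k += 1
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_sf(x, df):
    """Upper tail Pr{X >= x} of a central chi-square variable."""
    return 1.0 - gamma_p(df / 2.0, x / 2.0)


def ncx2_sf(x, df, lam, max_terms=1000):
    """Upper tail of the non-central chi-square, as used in (5.9)."""
    weight = math.exp(-lam / 2.0)   # Poisson(lam / 2) probability of j = 0
    total = 0.0
    covered = 0.0
    for j in range(max_terms):
        total += weight * chi2_sf(x, df + 2 * j)
        covered += weight
        if 1.0 - covered < 1e-14:
            break
        weight *= (lam / 2.0) / (j + 1)
    return total


def patnaik_sf(x, df, lam):
    """Central chi-square approximation (5.11) with matched first two moments."""
    rho = (df + 2.0 * lam) / (df + lam)
    df_prime = (df + lam) ** 2 / (df + 2.0 * lam)
    return chi2_sf(x / rho, df_prime)


def noncentrality(n, alt_probs, null_probs):
    """Non-centrality parameter lambda of (5.10)."""
    return n * sum((p - q) ** 2 / q for p, q in zip(alt_probs, null_probs))
```

With λ = 0 both approximations reduce to the central chi-square tail, which gives a simple consistency check on an implementation of this kind.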

Tables (5.1) and (5.2) compare these results of equations
(5.6), (5.9), and (5.11) for two different sets of hypotheses. The
approximations to the exact power given by (5.9) and (5.11) are close
to each other. They both overestimate the exact power of T for
n < 20 in example 1 presented in Table 5.1 and for n < 15 in
example 2 presented in Table 5.1. However, for n = 20 in example 2,
they underestimate the exact power.


5.3  Two-way  Classification. 

In this case, the set of observations forms a two-way
contingency table in which three different subcases can be distinguished,
namely:


(i)  Neither  set  of  marginal  sums  fixed. 

(ii)  One  set  of  marginal  sums  fixed. 

(iii)  Both  sets  of  marginal  sums  fixed. 


In each of the subcases, the expression for the test statistic T is

$$T = \sum_{i=1}^{r} \sum_{j=1}^{s} (n_{ij} - n_{i.} n_{.j}/n)^2 / (n_{i.} n_{.j}/n), \tag{5.12}$$

where $r$ denotes the number of rows, $s$ the number of columns,
$n_{ij}$ the frequency in the $ij$th cell, $n_{i.}$ the marginal $i$th row
sum, and $n_{.j}$ the marginal $j$th column sum. Also, $n$ denotes the
grand total.

In order to obtain the null distribution of T, it will be
necessary to consider each of the subcases separately.

5.3.1 Neither Set of Marginal Sums Fixed.


In this subcase, we will consider the following null hypothesis:

$$H_0 : \{p_{ij} = p_{i.}\, p_{.j} \mid i = 1, \ldots, r;\ j = 1, \ldots, s\}, \tag{5.13}$$

where $p_{ij}$ is the probability that an observation will fall in the
$ij$th cell, $p_{i.}$ are the marginal row probabilities, and $p_{.j}$ the
marginal column probabilities. Then

$$\sum_{i=1}^{r} p_{i.} = \sum_{j=1}^{s} p_{.j} = 1. \tag{5.14}$$

The observed cell frequencies, $n_{ij}$, will be distributed under the
null hypothesis as:

$$\varphi_0(n_{ij}) = n! \left[ \prod_{i=1}^{r} \prod_{j=1}^{s} n_{ij}! \right]^{-1} \prod_{i=1}^{r} \prod_{j=1}^{s} (p_{i.}\, p_{.j})^{n_{ij}}, \tag{5.15}$$

where $\varphi_0(n_{ij})$ will denote the probability of observing the
$n_{ij}$ ($i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, s$), and the null distribution
of T will be given by

$$\Pr\{T \ge t_\alpha \mid H_0\} = \sum_{n_{ij}} \varphi_0(n_{ij}), \ \text{such that} \ T \ge t_\alpha. \tag{5.16}$$

The results for the cases a) r = 2, s = 3 and b) r = 3, s = 3
are summarized in figures (5.3) and (5.4). Since T is asymptotically
distributed as the central $\chi^2$ distribution with (r-1)(s-1)
degrees of freedom, this curve is also plotted on the graphs for
comparison purposes. It should be noted that the exact distribution
of T is fairly well approximated by the $\chi^2$ distribution for
relatively small sample sizes.

5.3.2  Only  Row  Marginal  Sums  Fixed. 

For this subcase the null hypothesis becomes

$$H_0 : \{p_{ij} = p_{.j} \mid j = 1, 2, \ldots, s\}, \ \text{with} \tag{5.17}$$

$$\sum_{j=1}^{s} p_{.j} = 1, \qquad \sum_{j=1}^{s} n_{ij} = n_{i.} \ \text{(fixed)}. \tag{5.18}$$

The distribution of the observed cell frequencies, $n_{ij}$, under the
null hypothesis is:

$$\varphi_0(n_{ij}) = \prod_{i=1}^{r} \left[ n_{i.}! \prod_{j=1}^{s} p_{.j}^{\,n_{ij}} / n_{ij}! \right], \tag{5.19}$$

and the null distribution of T will again be given by (5.16).
Also T is asymptotically distributed as $\chi^2$ with (r-1)(s-1) degrees
of freedom. These results are displayed for several different sets
of marginal sums, in figures (5.5) and (5.6).


5.3.3 Both Sets of Marginal Sums Fixed.

It has been shown by Mood [16] that the distribution of the
cell frequencies, $n_{ij}$, does not depend upon the cell probabilities,
namely, $p_{ij}$, but is dependent only on the fixed marginal sums.
The distribution of the cell frequencies is given by the
hypergeometric distribution:

$$\varphi_0(n_{ij} \mid n_{i.},\, n_{.j} \ \text{fixed}) = \left[ \prod_{i=1}^{r} n_{i.}! \right] \left[ \prod_{j=1}^{s} n_{.j}! \right] \left[ n! \prod_{i=1}^{r} \prod_{j=1}^{s} n_{ij}! \right]^{-1}, \tag{5.20}$$

where

$$\sum_{i=1}^{r} n_{ij} = n_{.j} \ \text{(fixed)}, \qquad \sum_{j=1}^{s} n_{ij} = n_{i.} \ \text{(fixed)}. \tag{5.21}$$

The exact distribution of T is given by (5.16), and T is
asymptotically distributed as $\chi^2$ with (r-1)(s-1) degrees of
freedom. Typical results are given in figures (5.7) and (5.8) for
several different sets of marginal sums.
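
For the case of both margins fixed, the exact calculation can be sketched in Python as below. The function names are hypothetical and the code is a minimal illustration, not the original program: it generates every table of non-negative integers with the given row and column sums, assigns each the hypergeometric probability of (5.20), and accumulates the probability of tables whose statistic (5.12) reaches a given value. It assumes that all marginal sums are positive and that the row sums and column sums have the same total.

```python
from fractions import Fraction
from math import factorial


def bounded_compositions(total, caps):
    """Yield tuples k with sum(k) == total and 0 <= k[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in bounded_compositions(total - first, caps[1:]):
            yield (first,) + rest


def tables_with_margins(row_sums, col_sums):
    """Yield every r x s table (tuple of rows) with the given margins."""
    if len(row_sums) == 1:
        yield (tuple(col_sums),)   # the last row is forced by the column sums
        return
    for first in bounded_compositions(row_sums[0], tuple(col_sums)):
        remaining = tuple(c - f for c, f in zip(col_sums, first))
        for rest in tables_with_margins(row_sums[1:], remaining):
            yield (first,) + rest


def hypergeometric_pmf(table):
    """Probability (5.20) of a table, given its own marginal sums."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    num = 1
    for m in row_sums + col_sums:
        num *= factorial(m)
    den = factorial(sum(row_sums))
    for row in table:
        for cell in row:
            den *= factorial(cell)
    return Fraction(num, den)


def two_way_T(table):
    """Statistic (5.12), in exact rational arithmetic."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    return sum(
        Fraction((n * cell - ri * cj) ** 2, n * ri * cj)
        for row, ri in zip(table, row_sums)
        for cell, cj in zip(row, col_sums)
    )


def exact_tail_fixed_margins(row_sums, col_sums, t):
    """Pr{T >= t | H_0} when both sets of marginal sums are fixed."""
    return sum(
        hypergeometric_pmf(tab)
        for tab in tables_with_margins(tuple(row_sums), tuple(col_sums))
        if two_way_T(tab) >= t
    )
```

In the smallest case, a 2 x 2 table with all marginal sums equal to 1, there are two admissible tables, each with probability 1/2 and each with T = 2.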


5.4  Further  Extensions. 

The exact power of the T test for the case of a two-way
classification has yet to be investigated. Also, both the null
distribution and power calculations can be done for other
classifications such as a three-way table. However, present computing
equipment is inadequate for extending most of the above results,
since even a 3x3 contingency table with neither margin fixed
requires a considerable amount of computing time on an IBM 7090.
The actual number of combinations that were investigated for a
sample size of 15 was 490,314, and this number increases rapidly
with n.
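
The figure of 490,314 agrees with the usual count of the ways to distribute n indistinguishable observations over rs cells, namely the binomial coefficient C(n + rs - 1, rs - 1). The short Python sketch below (the function name is hypothetical) evaluates that count, on the assumption that every table with neither margin fixed is enumerated once.

```python
from math import comb


def number_of_tables(n, rows, cols):
    """Number of rows x cols tables of non-negative integers with grand total n."""
    cells = rows * cols
    return comb(n + cells - 1, cells - 1)


print(number_of_tables(15, 3, 3))   # 490314
print(number_of_tables(20, 3, 3))   # count for a sample of 20
```

The second call shows how quickly the count grows once n is increased beyond 15.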


TABLE  5.1 

Power of the T Test for a One-way Classification

Figure 5.1 Exact null distribution for a one-way classification.
$H_0$: { $p_i = 1/4$, $i = 1, 2, 3, 4$ }; 3 degrees of freedom.

Figure 5.2 Exact null distribution for a one-way classification.
$H_0$: { $p_1$ = [illegible], $p_2 = 3/16$, $p_3$ = [illegible], $p_4 = 3/16$ }; 3 degrees of freedom.

Figure 5.5 Exact null distribution for a 3x3 contingency table.
$H_0$: { $p_{.j} = 1/3$, $j = 1, 2, 3$ }; 4 degrees of freedom and row marginal sums are fixed.

Legend: Chi-square; Sample of 10, R1 = 5; Sample of 15, R1 = 7.

Figure 5.6 Exact null distribution for a 2x3 contingency table.
$H_0$: { $p_{.j} = 1/3$, $j = 1, 2, 3$ }; 2 degrees of freedom and row marginal sums are fixed.